Preface


When I began thinking about and working on this second edition, it became 
clear early on that substantive additions to the first edition were in order. 
Although the optical principles upon which the earlier text was based have not 
changed, the ingenuity and resourcefulness of astronomers in the intervening 
years have led to many exciting new instrumental developments. These
developments, in turn, have meant a greatly increased efficiency in gathering data from
celestial sources. As one example to illustrate this change, note the use of optical 
fibers to feed light from a hundred or more galaxies at a time into a spectrometer, 
rather than the traditional one galaxy at a time approach. 

Other dramatic developments within the past decade include implementing or 
planning for techniques of adaptive optics to compensate for the atmosphere, and 
the almost total adoption of solid-state detector arrays. But the biggest change of
all is only starting to become reality, that of a significant number of ground-based 
telescopes of near diffraction-limited quality and apertures greater than six meters 
in diameter. This greatly increased light gathering power will undoubtedly 
revolutionize observational astronomy. 

In view of these developments, and in response to the many comments I 
received on the first edition, my thrust in this rework has been two-fold. First, 
many portions of the text were rewritten or amended to make the explanations 
more clear and to correct errors. In some cases this meant adding additional 
material, such as spot diagrams or wavefront maps; in other cases words and 
figures were removed. Second, new sections were added to many chapters and 
one new chapter, on adaptive optics, was added. The overall format of the first 
edition has not been changed, and I hope the reader will find the changes in this
edition to be positive ones. 

As in the first edition, my intent is to emphasize basic principles of optics and 
how these principles are used in the designs of specific types of instruments. The 
treatment is limited to telescopes and cameras that use near-normal incidence 
optics and spectrometers with dispersive elements or interferometers. Numerous 
examples of system characteristics are given to illustrate the optical performance 
that can be expected. An outline of the topics covered is given in Chapter 1. 

The level of presentation and approach are appropriate for a graduate student 
in astronomy approaching the subject of astronomical optics for the first time. 
Although the basic principles of optics are discussed, it is assumed that the reader 
has the equivalent of an intermediate-level optics course at the undergraduate 
level. This book should also serve as a useful reference for active researchers. 

Because the presentation is not simply a compilation of types of telescopes 
and spectrometers, the reader should consult the original sources for details on 
specific instruments or telescopes. I have given an expanded bibliography and list 
of references, including conference proceedings, to facilitate further exploration. I 
have also added a table of symbols and their meanings as an aid to the reader. 

A number of persons contributed directly or indirectly to the writing of the first 
edition and this revision. First and foremost I thank Arthur Code, who gave me 
the opportunity of participating in the development of the Wisconsin Experiment 
Package of the first Orbiting Astronomical Observatory. Since that time I have 
been privileged to draw upon his wealth of knowledge and to teach jointly with 
him on one occasion a course on astronomical optics. For his contributions I am 
especially grateful. My thanks also to Arthur Hoag, Robert Bless, and Donald 
Osterbrock for their help and support over the years, and to Robert O’Dell for his 
encouragement to take part in NASA’s Hubble Space Telescope Project. 

Although many persons contributed to this rework, I mention only a few by 
name. Robert Lucke gave several pedagogical suggestions, especially on my 
discussion of distortion, that have been incorporated into the text. Derek Salmon 
asked some questions about misaligned telescopes and that section has been 
greatly expanded in this edition. The excellent book Reflecting Telescope Optics I 
by Raymond Wilson has been an important resource during the revision process. 
For their input, and the numerous other comments I have received, I am grateful. 

Finally, and most importantly, I acknowledge the support, encouragement, and 
patience of my wife LaVern while I worked on both editions of this book. 


Daniel J. Schroeder 
Chapter 1 Introduction


The increasing rate of growth in astronomical knowledge during the past few 
decades is a direct consequence of the increase in the number and size of 
telescopes and the efficiency with which they are used. Most celestial sources are
intrinsically faint, and observations that required hours of observing time with
small refracting telescopes and insensitive photographic plates are now done in
minutes with large reflecting telescopes and efficient solid-state detectors. The
increased efficiency with which photons are collected and recorded by modern 
instruments has indeed revolutionized the field of observational astronomy. 


1.1. A BIT OF HISTORY 


Early in the 1900s the desire for larger light gathering power led to the design 
and construction of the 100-in Hooker telescope located on Mount Wilson in 
California. This reflecting telescope and its smaller predecessors were built 
following the recognition that refracting telescopes, such as the 36-in one at 
Lick Observatory in California and the 40-in one at Yerkes Observatory, in 
Wisconsin, had reached a practical limit in size. With the 100-in telescope, it was 
possible to start systematic observations of nearby galaxies and start to attack the 
problem of the structure of the universe. 

Although the 100-in telescope was a giant step forward for observational 
astronomy, it was recognized by Hale that still larger telescopes were necessary 
for observations of remote galaxies. Due largely to his efforts, work began on the 
design and construction of a 200-in (5-m) telescope in the late 1920s. The Hale
telescope was put into operation in the late 1940s and remained the world’s largest 
until a 6-m telescope was built in Russia in the mid-1970s. 

The need for more large telescopes became acute in the 1960s as the 
boundaries of observational astronomy were pushed outward. Plans made 
during this decade and the following one resulted in the construction of a 
number of optical telescopes in the 4-m class during the 1970s and 1980s in 
both hemispheres. These telescopes, equipped with efficient detectors, fueled an 
explosive growth in observational astronomy. 

Large reflectors are well-suited for observations of small parts of the sky, 
typically a fraction of a degree in diameter, but they are not suitable for surveys of 
the entire sky. A type of telescope suited for survey work was first devised by 
Schmidt in the early 1930s. The first large Schmidt telescope was a 1.2-m 
instrument covering a field about 6° across, and put into operation on Palomar 
Mountain in the early 1950s. Several telescopes of this type and size have since 
been built in both hemispheres. The principle of the Schmidt telescope has also 
been adapted to cameras used in many spectrometers. 

While construction of telescopes was underway during the 1970s and 1980s, 
astronomers were already planning for the next generation of large reflectors. In 
the quest for still greater light-gathering power, attention turned to the design of 
arrays of telescopes and segmented mirrors, and to new techniques for casting 
and figuring single mirrors with diameters in the 8-m range. The fruits of these 
labors became apparent in the late 1990s with the coming online of a significant 
number of telescopes in the 8- to 10-m class. 

The array concept was first implemented with the completion of the Multiple-Mirror
Telescope (MMT) on Mount Hopkins, Arizona, a telescope with six 1.8-m
telescopes mounted in a common frame and an aperture equivalent to that of a
single 4.5-m telescope. Beams of the separate telescopes were directed to a
common focal plane and either combined in a single image or placed side-by-side
on the slit of a spectrometer. Although the MMT concept proved workable,
advances in mirror technology prompted the replacement of the separate 
mirrors with a single 6.5-m mirror in the same telescope structure and 
building. 

The segmented mirror approach was the choice for the Keck Ten-Meter 
Telescope (TMT), with 36 hexagonal segments the equivalent of a single filled 
aperture. This approach requires active control of the positions of the segments to 
maintain mirror shape and image quality. Even before the first TMT had been 
pointed to its first star, its twin was under construction on Mauna Kea, Hawaii, 
and together these two telescopes are obtaining dramatic observational results. 
Another segmented mirror telescope is the Hobby-Eberly Telescope designed 
primarily for spectroscopy. 

Although it seemed in the 1980s that multiple and segmented mirrors were the
wave of the future, new techniques for making large, “fast” primary mirrors and 
controlling their optical figure in a telescope led to the design and construction of 
several 8-m telescopes. Among these are the Very Large Telescopes (VLT) of the 
European Southern Observatory, the Gemini telescopes, Subaru, and Large 
Binocular Telescope (LBT). Used singly or as components of an interferometric 
array (for the VLT and LBT), observations are possible that could only be 
dreamed of in the 1970s. 

Instrumentation used on large telescopes has also shown dramatic changes
since the time of the earliest reflectors. In spectrometers, small prism
instruments were replaced by larger grating instruments at
both Cassegrain and coudé focus positions to meet the demands for higher
spectral resolution. In recent years many of these high-resolution coudé
instruments have, in turn, been replaced by echelle spectrometers at the Cassegrain
focus. On the largest telescopes, such as the TMT and VLT, most large
instrumentation is at the Nasmyth focus position on a platform that rotates 
with the telescope. Nearly all spectrographic instruments and imaging cameras 
now use solid-state electronic detectors of high quantum efficiency that, 
coupled with these telescopes, make possible observations of still fainter celestial 
objects. 

Although developments of ground-based optical telescopes and instruments 
during the last three decades of the 20th century have been dramatic, the same can 
also be said of Earth-orbiting telescopes in space. Since the first Orbiting 
Astronomical Observatory in the late 1960s, with its telescopes of 0.4-m and 
smaller, the size and complexity of orbiting telescopes have increased markedly. 
The 2.4-m Hubble Space Telescope (HST), once its problem of spherical 
aberration was fixed, has made observations not possible with ground-based 
telescopes. Although its light gathering power is significantly smaller than that of 
many ground-based telescopes, its unique capabilities of observing sources in
spectral regions absorbed by our atmosphere and of imaging to the diffraction 
limit are leading the revolution in astronomy. 

Because of the high cost of a telescope in space, there has been significant 
effort to improve the quality of images of ground-based telescopes. These efforts 
include controlling the thermal conditions within telescope enclosures and 
incorporating active and adaptive optics systems into telescopes. With these 
techniques it becomes possible to obtain images of near-diffraction-limited 
quality, at least over small fields and for brighter objects. 

This brief excursion into the development of telescopes and instruments 
up to the present and into the near future is by no means complete. It is 
intended only to illustrate the range of tools now available to the observational 
astronomer. 


1.2. APPROACH TO SUBJECT


Most of the optical principles that serve as the starting point in the design and 
use of any optical instrument have been known for a long time. In intermediate- 
level optics texts these principles are usually divided into two categories: 
geometrical optics and physical optics. Elements from both of these fields are 
required for full descriptions of the characteristics of optical systems. 

The theory of geometrical optics is concerned with the paths taken by light 
rays as they pass through a system of lenses and/or mirrors. Although the ray 
paths can be calculated by simple application of the laws of refraction and 
reflection, a much more powerful approach is one that starts with Fermat’s 
Principle. With the aid of this approach it is possible to determine both the
first-order characteristics of an optical system and deviations from these
characteristics. The latter leads to the theory of aberrations or image defects, a subject to be
discussed in detail. 

The theory of physical optics includes the effects of the finite wavelength of 
light and such topics as interference, diffraction, and polarization. Analyses of the 
characteristics of diffraction gratings, interferometers, and telescopes such as the 
Hubble Space Telescope require an understanding of these topics. The basics of 
this theory are introduced prior to our discussions of these types of optical 
systems. 

The approach, therefore, is to emphasize the basic principles of a variety of 
systems and to illustrate these principles with specific designs. Although the 
specifics of telescopes and instruments have changed, and will continue to 
change, the basic optical principles are the same. 


1.3. OUTLINE OF BOOK 


The 17 chapters that follow the Introduction can be grouped into six distinct 
categories. Chapters 2 through 5 cover the elements of geometrical optics needed 
for the discussion of optical systems. The first three chapters of this group are an 
introduction to this part of optics seen from the point of view of Fermat’s 
Principle, with Chapter 5 a detailed treatment of aberrations based on this 
principle. 

Chapters 6 through 11 cover the characteristics of a variety of telescopes and 
cameras, including auxiliary optics used with them. The characteristics of 
diffraction-limited telescopes are covered in the last two chapters of this group, 
with application to the Hubble Space Telescope. 


Chapters 12 through 15 are a discussion of the principles of spectrometry and 
their application to a variety of dispersing systems, with the emphasis on 
diffraction gratings. In this group Chapter 14 is the counterpart of Chapter 5, a 
treatment of grating aberrations from the point of view of Fermat’s Principle. 

The remaining three chapters (16, 17, and 18) are distinct in themselves with 
each chapter drawing upon results given in preceding chapters and applying these 
results to selected types of observations for both ground-based and space-based 
systems. 

A closer look at the contents of each chapter is now in order. Chapter 2 is an 
introduction to the basic ideas of geometrical optics, and the reader who is well 
versed in these ideas can cover it quickly. One topic covered in this chapter, not 
part of the usual course in optics, is the definition of normalized parameters for 
two-mirror telescopes. 

Chapter 3 is an introduction to Fermat’s Principle with a number of examples 
illustrating its utility, including a brief discussion of atmospheric refraction and 
atmospheric turbulence. Chapter 4 is an introduction to aberrations, with 
emphasis on spherical aberration. The concept of aberration compensation is 
introduced and applied to two optical systems. 

The discussions of the preceding three chapters set the stage for an in-depth 
discussion of the theory of third-order aberrations in Chapter 5. The results of the 
analysis are summarized in tables for easy reference. 

In Chapter 6 we draw on the results from Chapter 5 to derive the characteristics 
of a number of types of reflecting telescopes. Comparisons of image quality are 
given for several of these types, including examples of image quality for 
misaligned two-mirror telescopes. Chapter 7 covers the characteristics of Schmidt 
systems, including a discussion of the achromatic Schmidt and solid and semi- 
solid cameras. 

Chapter 8 covers various types of catadioptric systems, including Schmidt- 
Cassegrain telescopes and cameras with meniscus correctors substituted for 
aspheric plates. The following chapter (9) is a discussion of various types of 
auxiliary optics used with telescopes, including field lenses, field flatteners, prime 
and Cassegrain focus correctors, focal reducers, atmospheric dispersion correctors,
and fiber optics.

In Chapter 10 we discuss the basics of diffraction theory and aberrations and 
the characteristics of perfect and near-perfect images. Perfect and near-perfect 
images are discussed in terms of classical and orthogonal aberrations in Chapter 
10, followed by a discussion in terms of transfer functions in Chapter 11. The 
results are illustrated with a discussion of the optical characteristics of the Hubble 
Space Telescope, both expected before launch and as measured after launch. 

Chapter 12 covers the basic principles of spectrometry, followed by application 
of these principles to a variety of dispersing elements and systems in Chapter 13. 
The following two chapters are devoted entirely to the diffraction grating, with
Chapter 14 an analysis of grating aberrations and concave grating mountings and 
Chapter 15 the application of these results to a variety of plane grating 
instruments. 

Chapter 16 is an introduction to adaptive optics and the approach to correction 
of wavefront distortion due to atmospheric turbulence to restore image quality. In 
Chapter 17 we discuss detectors in terms of transfer functions and Nyquist 
sampling, signal-to-noise ratio (SNR), and the detection limits that are reached at 
a given SNR level for several types of observations. The final chapter covers two
separate topics: residual errors of real mirrors and effects of these errors on image 
quality, and diffraction-limited images given by telescope arrays. 

The reader approaching the topic of astronomical optics for the first time is 
encouraged to work through the basic theory. This exercise will facilitate the 
understanding of its application to a specific optical system and the bounds within 
which this system is usable. Other readers, on the other hand, will be interested 
only in specific systems and their characteristics. We hope that their needs are met 
with the tables and equations that are given. Whatever the motivation, a selected 
bibliography is given at the end of each chapter for additional reading. 

A more complete understanding of any optical system is achievable if an 
analysis using the basic theory is supplemented with data from one of the many 
optical design packages now available. Such packages generally provide a large 
number of analysis tools and can give the user a detailed picture of how an optical 
system will perform. Tasks ranging from simple tracing of rays to complete 
diffraction analysis are essential in the design of complex optical systems. 

In preparing the figures in this book, we have made extensive use of the optical 
design program ZEMAX from Focus Software, Inc. of Tucson. As a help to the 
reader, many of the optical systems used as examples in our discussions are 
available from the public free download part of the web site
www.focus-software.com. An interested reader is encouraged to use the supplied design
files as a starting point for further self-study of the examples in the text. 
Optics


The analysis of any optical system generally proceeds along a well-defined 
route. First one arrives at a basic layout of optical elements: lenses, mirrors, 
prisms, gratings, and such, by using first-order or Gaussian optics. Such an 
analysis establishes such basic parameters as focal length, magnification, and 
locations of pupils, among others. The next step often involves using a ray-trace 
program on a digital computer to trace rays through the system and calculate 
aberrations of the image. Such an analysis might dictate changes in the basic 
layout in order to achieve image quality within certain specified limits. Ray trace 
and optical analysis programs are now quite sophisticated and are particularly 
useful in systems with many optical elements. Tracing of rays is especially useful 
in optimizing system performance. 

In order to efficiently use the results generated by a ray-trace program it is 
necessary to understand the theory of third-order aberrations. In subsequent 
chapters we go into considerable detail on the nature of these aberrations and how 
they can be eliminated or minimized in different kinds of optical systems. In 
many cases an analysis of aberrations is a useful intermediate step between the
setup of the basic system and the analysis using a ray-trace program. Details of
how such programs work are not discussed. 

Each of the steps along this route requires a systematic approach to measurements
of angles and distances. In this chapter we define the sign conventions used
and determine the equations of first-order optics. We apply these equations to 
several systems including two-mirror telescope systems. 
2.1. SIGN CONVENTIONS


The coordinate system within which surface locations and ray directions are 
defined is the standard right-hand Cartesian frame shown in Fig. 2.1. For a single 
refracting or reflecting surface the z-axis coincides with the optical axis, with the 
origin of the coordinate system at the vertex O of the surface. For an optical 
system in which the elements are centered, the optical axis is the line of symmetry 
along which the elements are located. In a system in which one or more of the 
elements is not centered, the optical axis for such an element will not coincide 
with that for a different element, a complication that is dealt with later. In the 
following discussion only centered systems are considered. 

Figure 2.1 illustrates refraction at a spherical surface with an incident ray 
directed from left to right. Rays from an initial object are always assumed to travel 
in this direction. The indices of refraction are n and n’ to the left and right of the 
surface, respectively, with points B, B’, and C on the optical axis of the surface. 
The line PC is the normal to the interface between the two media at point P, and a 
ray directed toward B is refracted at P and directed toward B’. 

The unprimed symbols in Fig. 2.1 refer to the ray before refraction, while the 
primed symbols refer to the ray after refraction. The slope angles are u and u’,
measured from the optical axis, and the angles of incidence and refraction, 
respectively, are i and i’, measured from the normal to the surface. The symbols s 
and s’ denote the object and image distances, respectively, and R represents the 
radius of curvature of the surface, measured at the vertex. 

The sign convention for distances is the same as for Cartesian geometry. 
Hence distances s, s’, and R are positive when the points B, B’, and C are to the 
right of the vertex, and distances from the optical axis are positive if measured
upward. The sign convention for angles is chosen so that all of the angles shown
in Fig. 2.1 are positive. Slope angles u and u’ are positive when a counterclockwise
rotation of the corresponding ray about B or B’ brings the ray into
coincidence with the z-axis. The angles of incidence and refraction, i and i’, are
positive when a clockwise rotation of the corresponding ray about point P brings
the ray into coincidence with the line PC. All rotations are made through acute
angles.

Fig. 2.1. Refraction at spherical interface. All angles and distances are positive in diagram; see
text for discussion.

The advantage of these conventions for distances and angles is that both 
refracting and reflecting surfaces can be treated with the same relations. As we 
show, formulas for reflecting surfaces are obtained directly by letting n’ = -n in
the formulas derived for refracting surfaces. The meaning of a negative index of 
refraction is discussed in Section 2.3. 

The sign conventions for distances and angles are similar to those used by 
Born and Wolf (1980) and by Longhurst (1967). Although the conventions for 
angles may at times seem awkward, they have the advantage of universal 
applicability and are especially appropriate in third-order analysis of complex 
systems. 


2.2. PARAXIAL EQUATION FOR REFRACTION 


In this section we develop some of the basics needed for a first-order analysis 
of an optical system. It is worth noting that our discussion is not intended as a 
comprehensive one, and should more details be needed the reader should refer to 
any of a number of excellent texts in optics. Examples of such texts are those by 
Longhurst (1967), Hecht (1987), or Jenkins and White (1976). The reader should be
aware, however, that the sign conventions used in the latter two of these books
differ from those used here.

With the help of Fig. 2.1 we can easily determine the relation between s and s’ 
when the distance y and all angles are small. By small we mean that point P is 
close enough to the optical axis so that sines and tangents of angles can be 
replaced with the angles themselves. In this approximation any ray is close to the 
axis and nearly parallel to it, hence the term paraxial approximation. 
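To give a feeling for how small "small" must be, the following Python sketch compares the sine and tangent of an angle with the angle itself. It is an illustration added here, not part of the original derivation; the function name paraxial_errors and the sample angles are our own choices.

```python
import math

def paraxial_errors(angle_deg):
    """Fractional errors made by replacing sin(x) and tan(x) with x (x in radians)."""
    x = math.radians(angle_deg)
    sin_err = (x - math.sin(x)) / math.sin(x)   # x slightly overestimates sin(x)
    tan_err = (math.tan(x) - x) / math.tan(x)   # x slightly underestimates tan(x)
    return sin_err, tan_err

# Illustrative angles only; they are not taken from the text.
for angle in (0.5, 2.0, 10.0):
    sin_err, tan_err = paraxial_errors(angle)
    print(f"{angle:5.1f} deg: sin error {sin_err:.2e}, tan error {tan_err:.2e}")
```

Both errors grow roughly as the square of the angle, which is why the paraxial results hold only for rays close to the axis.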

The exact form of Snell’s law of refraction is

$$n \sin i = n' \sin i', \tag{2.2.1}$$

which in the paraxial approximation becomes ni = n’i’. From Fig. 2.1 we find

$$i + u = \phi, \qquad i' + u' = \phi. \tag{2.2.2}$$

Solving these relations for i and i’, and substituting into the paraxial form of
Snell’s law gives

$$n'u' - nu = (n' - n)\phi. \tag{2.2.3}$$

Applying the paraxial approximation to the distances, we get φ = y/R, u = y/s,
and u’ = y/s’. Substituting, and canceling the common factor y, we get

$$\frac{n'}{s'} - \frac{n}{s} = \frac{n' - n}{R}. \tag{2.2.4}$$

The points at distances s and s’ from the vertex are called conjugate points, that is,
the image is conjugate to the object and vice versa. If either s or s’ = ∞, then the
conjugate distance is the focal length, that is, s = f when s’ = ∞ and s’ = f’
when s = ∞.


2.2.a. POWER 


In Eq. (2.2.4) we see that the right side of the equation contains factors relating 
to the surface and surrounding media, and not to the object and image. It is useful 
to denote this combination by P, where P is the power of the surface. The power is 
unchanged when the direction of light travel in Fig. 2.1 is reversed, provided n 
and n’ are interchanged and each is made negative. This invariance of P to the 
direction of light travel makes it a useful parameter. Note also that s and s’ change 
places when the light is reversed in Fig. 2.1, and Eq. (2.2.4) is unchanged. 

Combining Eq. (2.2.4) with the defined focal lengths and power we get 


$$\frac{n'}{s'} - \frac{n}{s} = \frac{n' - n}{R} = P = \frac{n'}{f'} = -\frac{n}{f}. \tag{2.2.5}$$

This is the first-order or Gaussian equation for a single refracting surface and is 
the starting point for analyzing systems that have several surfaces. For
multi-surface systems the image formed by a given surface, say the ith one, serves as
the object for the next surface, the (i + 1)st in this case. A surface-by-surface
application of Eq. (2.2.5), starting with the first surface, will be illustrated in 
examples to follow. 
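A surface-by-surface application of Eq. (2.2.5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used later in the text. The function names, the glass index of 1.5, the radii, and the 10-unit thickness are all hypothetical. The transfer step, in which the object distance for the next surface is the previous image distance minus the vertex separation, is the usual Cartesian bookkeeping and is assumed here rather than quoted from the text.

```python
import math

def surface_power(n, n_prime, R):
    """Power P = (n' - n) / R of a single surface, as in Eq. (2.2.5)."""
    return (n_prime - n) / R

def image_distance(s, n, n_prime, R):
    """Solve n'/s' - n/s = P for s'. Assumes s != 0; s = -inf is an object at infinity."""
    vergence = surface_power(n, n_prime, R) + n / s
    return math.inf if vergence == 0.0 else n_prime / vergence

def trace_surfaces(s, n, surfaces):
    """Apply Eq. (2.2.5) surface by surface.

    surfaces: list of (R, index_after_surface, distance_to_next_vertex).
    Returns the final image distance measured from the last vertex.
    """
    s_prime = s
    for R, n_after, gap in surfaces:
        s_prime = image_distance(s, n, n_after, R)
        s = s_prime - gap      # image of this surface is the object for the next
        n = n_after
    return s_prime

# Hypothetical biconvex lens in air: R1 = +100, R2 = -100, index 1.5, thickness 10.
lens = [(100.0, 1.5, 10.0), (-100.0, 1.0, 0.0)]
back_focal_distance = trace_surfaces(-math.inf, 1.0, lens)
print(f"image distance from last vertex: {back_focal_distance:.3f}")
```

With the object at infinity the first surface forms its image at s’ = 300, so the object distance for the second surface is 290, and the final image lies 290/2.95, or about 98.3 units, to the right of the second vertex.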

Equation (2.2.5) does not contain height y and hence applies to any ray passing 
through B before refraction, provided of course the paraxial approximation is 
valid. This equation also applies to object and image points that are not on the 
optical axis, provided these points are close to B and B’ and lie on a line passing 
through point C. This is illustrated in Fig. 2.2, where Q and Q’ denote an object 
and image point, respectively, for a case where B and B’ lie on opposite sides of 
the surface vertex. In Fig. 2.2 the line QCQ’ can be thought of as a new axis of 
the spherical surface, where Q and Q’ are conjugate points along the new axis just 
as B and B’ are conjugate points on the original axis. If the angle φ in Fig. 2.2 is
small, then the line segments BQ and B’Q’ can be taken perpendicular to the
original axis. In general, of course, BQ and B’Q’ are short arcs of circles whose
centers are at C.

Fig. 2.2. Conjugate points in the paraxial region. Here B and B’, Q and Q’ are pairs of conjugate
points. See Eq. (2.2.7) for definition of transverse magnification.


2.2.b. MAGNIFICATION


The geometry in Fig. 2.2 can be used to determine the transverse or lateral 
magnification m, defined as the ratio of image height to object height. In symbols 
we have m = h'/h, where 


$$h' = -(s' - R)\tan\phi, \qquad h = -(s - R)\tan\phi, \tag{2.2.6}$$


and the sign convention has been applied to each quantity. Note that the paraxial 
approximation has not been applied in Eq. (2.2.6) in order to emphasize the fact 
that for this definition the object and image lie in planes perpendicular to the axis. 

In Fig. 2.2 we have s’ and R > 0 and s and φ < 0, hence h and h’ have
opposite signs. Therefore 


$$m = \frac{h'}{h} = \frac{s' - R}{s - R} = \frac{ns'}{n's}, \tag{2.2.7}$$


where the final step follows by substitution of Eq. (2.2.4). Because h and h’ have
opposite signs in Fig. 2.2, the transverse magnification is negative for the case 
shown. If m < 0, as in Fig. 2.2, the image is inverted relative to the object; in the 
case where m > 0 the image is said to be erect. 

In Fig. 2.3 a ray joining conjugate points B and B’ has slope angles u and u’.
The angular magnification M is defined as tan u’/ tan u, where from the geometry 
of Fig. 2.3 we see that y = s tanu = s' tan u’. Therefore 


$$M = \frac{\tan u'}{\tan u} = \frac{s}{s'} = \frac{n}{n'm} = \frac{nh}{n'h'}. \tag{2.2.8}$$

Fig. 2.3. Angular magnification. See Eq. (2.2.8) for definition.


Equation (2.2.8) relates the transverse and angular magnification for a pair of 
conjugate planes. Rewriting this relation we get 


$$nh\tan u = n'h'\tan u', \tag{2.2.9}$$

which in the paraxial approximation becomes

$$nhu = n'h'u'. \tag{2.2.10}$$


If, as is customary, we let H = nh tanu, then Eq. (2.2.9) states that H before 
refraction is the same as H after refraction. Thus in any optical system containing 
any number of refracting (or reflecting) surfaces, H is an invariant. This follows 
because the combination n'h'u’ for the first surface is nhu for the second surface, 
and so on. Called the Lagrange invariant, H is important in at least one other 
respect; the total flux collected by an optical system from a uniformly radiating 
source of light is proportional to H². Its invariance through an optical system is
thus a consequence of conservation of energy. 
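The relations among the two magnifications and the Lagrange invariant can be checked numerically for a single refracting surface. The sketch below is illustrative only. The surface (air to glass of index 1.5, R = +100), the object distance, the object height, and the ray height y are hypothetical values chosen for convenience, and the helper image_distance is our own name for the solution of Eq. (2.2.4).

```python
import math

def image_distance(s, n, n_prime, R):
    """Solve n'/s' - n/s = (n' - n)/R for s' (finite, nonzero s assumed)."""
    return n_prime / ((n_prime - n) / R + n / s)

# Hypothetical surface and object position.
n, n_prime, R = 1.0, 1.5, 100.0
s = -600.0
s_prime = image_distance(s, n, n_prime, R)

# Transverse magnification, both forms of Eq. (2.2.7).
m = n * s_prime / (n_prime * s)
m_geometric = (s_prime - R) / (s - R)

# Angular magnification in the paraxial limit, Eq. (2.2.8).
M = s / s_prime

# Paraxial Lagrange invariant, Eq. (2.2.10), for a ray striking the surface at height y.
h = 1.0
h_prime = m * h
y = 0.1
u, u_prime = y / s, y / s_prime
H_before = n * h * u
H_after = n_prime * h_prime * u_prime

print(f"s' = {s_prime:.1f}, m = {m:.3f}, M = {M:.4f}")
print(f"H before = {H_before:.6e}, H after = {H_after:.6e}")
```

For these values m is negative, so the image is inverted, and H is the same on both sides of the surface to within rounding.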


2.3. PARAXIAL EQUATION FOR REFLECTION 


With the aid of Fig. 2.4 we now find the Gaussian equation for a reflecting 
surface in the paraxial approximation. Applying the sign conventions to the 
symbols shown gives distances s, s’, and R, and angles i, φ, u, and u’ as negative.
The law of reflection is i = -i’, hence the angle of reflection i’ is positive in Fig.
2.4. From the geometry shown we get 


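$$ i = \phi - u, \qquad i' = \phi - u', \qquad \phi = \frac{y}{R}, \qquad u = \frac{y}{s}, \qquad u' = \frac{y}{s'}. $$

Substituting into the law of reflection, i = −i’, gives

$$ \frac{1}{s'} + \frac{1}{s} = \frac{2}{R}. \qquad (2.3.1) $$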


Fig. 2.4. Reflection at spherical surface. Here B and B’ are conjugate axial points.

As in the case of Eq. (2.2.4), this relation applies generally to any object position
provided we use the appropriate signs for the distances. At this point it is
important to point out that the law of reflection follows directly from Snell’s law
of refraction if we make the substitution n’ = —n. Specifically, note that this 
substitution into Eq. (2.2.4) gives Eq. (2.3.1) directly. The fact that the relations 
for reflecting surfaces are thus directly obtained is very useful because we need 
only consider relations for refracting surfaces and simply put n’ = —n as needed. 
As an example we apply this substitution to Eqs. (2.2.5) and (2.2.7) and get 


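$$ \frac{1}{s'} + \frac{1}{s} = \frac{2}{R} = -\frac{P}{n} = \frac{1}{f'} = \frac{1}{f}, \qquad (2.3.2) $$

$$ m = -\frac{s'}{s}. \qquad (2.3.3) $$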


Using Eq. (2.3.2) it is easy to verify that P > 0 for a concave mirror and P < 0 
for a convex mirror, where a mirror is concave or convex as seen from the 
direction of the incident light. Note, however, that the focal length of a concave 
mirror changes sign when the direction of the incident light is reversed. This is 
expected because the reversal of Fig. 2.4, left for right, changes the signs of s and 
s’. But because n also changes sign in this reversal, P is invariant. 
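
A short numerical illustration of these sign rules follows. It is a sketch only, assuming the mirror power takes the form P = −2n/R implied by Eq. (2.3.2); the function names and the 2000-mm radius are hypothetical choices made for the example.

```python
# Mirror relations obtained by putting n' = -n; values are illustrative assumptions.

def mirror_power(n, R):
    """Power of a mirror, P = -2n/R (n = +1 for light traveling left to right)."""
    return -2.0 * n / R

def mirror_image_distance(s, R):
    """Image distance s' from 1/s' + 1/s = 2/R, Eq. (2.3.1)."""
    return 1.0 / (2.0 / R - 1.0 / s)

# Concave mirror, light traveling left to right: n = +1, R < 0.
P_concave = mirror_power(+1.0, -2000.0)
# Same mirror with the figure reversed left for right: n = -1, R > 0.
P_concave_reversed = mirror_power(-1.0, +2000.0)
# Convex mirror, light traveling left to right: n = +1, R > 0.
P_convex = mirror_power(+1.0, +2000.0)

# Real object 3000 mm in front of the concave mirror (s < 0).
s = -3000.0
s_prime = mirror_image_distance(s, -2000.0)
m = -s_prime / s                      # Eq. (2.3.3)

print(P_concave, P_concave_reversed, P_convex)
print(f"s' = {s_prime:.1f} mm, m = {m:.2f}")
```

The power of the concave mirror is positive and unchanged by the reversal, while the image in this case is real and inverted.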

A negative index of refraction simply means that the light is
traveling in the direction of the −z-axis, that is, from right to left. Consistent use of this
convention, together with the other sign conventions in Eq. (2.2.2), allows one to 
work with any set of refracting and/or reflecting surfaces in combination. 

In many situations it is convenient to take f > 0 for a concave mirror and 
f < 0 for a convex mirror, independent of the direction of the incident light. We 
will adopt this convention for convenience, keeping in mind that it violates the 
strict sign convention. The sign convention for s, s’, and R will always be 
observed.


2.4. TWO-SURFACE REFRACTING ELEMENTS


We now apply the results of Section 2.2 to several systems with two refracting
surfaces: a thick lens, a thin lens, and a thick plane-parallel plate. We first
consider a thick lens, that is, a lens in which the second refracting surface is a
distance d to the right of the first surface.


2.4.a. THICK LENS 


A schematic cross-section of a thick lens is shown in Fig. 2.5. If we assume the
lens has index n and is located in air, then $n_1 = n_2' = 1$ and $n_1' = n_2 = n$.
Applying Eq. (2.2.5) to each surface gives

$$ \frac{n}{s_1'} - \frac{1}{s_1} = \frac{n - 1}{R_1} = P_1, \qquad \frac{1}{s_2'} - \frac{n}{s_2} = \frac{1 - n}{R_2} = P_2, \qquad (2.4.1) $$

where $s_2 = s_1' - d$.

With this system we find only the net power P or, equivalently, the effective
focal length f’, where P = 1/f’. Figure 2.5 shows a ray with $s_1 = \infty$ intersecting
the first surface at height $y_1$ and the second surface at height $y_2$. From similar
triangles in Fig. 2.5 we get

$$ \frac{y_2}{y_1} = \frac{s_2}{s_1'} = \frac{s_2'}{f'}. \qquad (2.4.2) $$





Fig. 2.5. Cross section of thick lens. See Eq. (2.4.3) for lens power. In the thin lens limit,
$f' = s' = s_2'$.

We can now find the effective focal length by setting $s_1 = \infty$ and $s_2 = s_1' - d$ in
Eq. (2.4.1) and combining the result with Eq. (2.4.2). After a bit of algebra we get

$$ P_1 = \frac{n}{s_1'}, \qquad P_2 = \frac{1}{s_2'} - \frac{n}{s_1' - d}, $$

$$ P = \frac{1}{f'} = \frac{1}{s_2'}\left(\frac{s_1' - d}{s_1'}\right) = \left(P_2 + \frac{n}{s_1' - d}\right)\left(\frac{s_1' - d}{s_1'}\right). $$

Multiplying out the preceding equation we finally get the result sought in the
form

$$ P = \frac{1}{f'} = P_1 + P_2 - \frac{d}{n} P_1 P_2. \qquad (2.4.3) $$


In the steps leading to Eq. (2.4.3), both n and d are positive. If the directions of 
the arrows in Fig. 2.5 are reversed, the foregoing derivation reproduces Eq.
(2.4.3), with $P_1$ and $P_2$ exchanging roles. In this case both d and n change sign
and the ratio (d/n) is unchanged in sign. Thus P in Eq. (2.4.3) is the same for 
either direction of light. Note that the effective focal length f’ in Fig. 2.5 is 
measured from the intersection of two extended rays, the incident ray to the right 
and the refracted ray to the left. 
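
As a check on Eq. (2.4.3), the sketch below computes the power of a thick lens in two ways: directly from the closed form, and by following a ray from infinity through the two surfaces with Eqs. (2.4.1) and (2.4.2). The function names and the lens prescription are illustrative assumptions.

```python
# Thick-lens power: closed form of Eq. (2.4.3) versus a surface-by-surface calculation.
# The lens prescription below is an illustrative assumption.

def thick_lens_power(n, R1, R2, d):
    """Net power from Eq. (2.4.3): P = P1 + P2 - (d/n) P1 P2."""
    P1 = (n - 1.0) / R1
    P2 = (1.0 - n) / R2
    return P1 + P2 - (d / n) * P1 * P2

def thick_lens_power_by_surfaces(n, R1, R2, d):
    """Net power found by applying Eq. (2.4.1) at each surface with s1 = infinity."""
    P1 = (n - 1.0) / R1
    P2 = (1.0 - n) / R2
    s1_prime = n / P1                    # first surface, object at infinity
    s2 = s1_prime - d                    # object distance for the second surface
    s2_prime = 1.0 / (P2 + n / s2)       # second surface
    f_prime = s2_prime * s1_prime / s2   # similar triangles, Eq. (2.4.2)
    return 1.0 / f_prime

n, R1, R2, d = 1.5, 100.0, -100.0, 10.0  # biconvex lens, lengths in mm
P_formula = thick_lens_power(n, R1, R2, d)
P_surfaces = thick_lens_power_by_surfaces(n, R1, R2, d)
P_thin = thick_lens_power(n, R1, R2, 0.0)

print(f"P (Eq. 2.4.3)   = {P_formula:.8f} per mm")
print(f"P (two surfaces) = {P_surfaces:.8f} per mm")
print(f"P (d = 0)        = {P_thin:.8f} per mm")
```

The two routes agree, and setting d = 0 recovers the sum of the surface powers.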


2.4.b. THIN LENS 


A thin lens is defined as one in which the separation of the two surfaces is
negligible compared to other axial distances, that is, $s_2 = s_1'$ effectively. For a thin
lens in air, Eqs. (2.4.1) apply directly. Letting $s_1 = s$ and $s_2' = s'$, the addition of
these equations gives

$$ \frac{1}{s'} - \frac{1}{s} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = P_1 + P_2 = P = \frac{1}{f'}. \qquad (2.4.4) $$

The net power of a thin lens is simply the reciprocal of its focal length and is the 
same as that of a thick lens with d = 0, as expected. Although a thin lens has two 
surfaces, it is of interest to note that the Gaussian relations that describe the lens 
are actually somewhat simpler than those for a single refracting surface. 

The transverse magnification of each surface is given by Eq. (2.2.7) with the
results $m_1 = s_1'/(n s_1)$ and $m_2 = n s_2'/s_2$. The net transverse magnification of a thin
lens is then $m = m_1 m_2 = s'/s$.

As a final item for thin lenses, we note that Eq. (2.4.3) also applies to two thin 
lenses separated by distance d, where n = 1 in the space between the lenses. The 
simple analysis showing this is left to the reader. 





2.4.c. THICK PLANE-PARALLEL PLATE


A thick plane-parallel plate, as shown in Fig. 2.6, has zero power but also has
an image that is displaced along the optical axis relative to the object.
Applying Eq. (2.2.5) at each surface gives $n_1'/s_1' = n_1/s_1$ and $n_2'/s_2' = n_2/s_2$.
Assuming the plate of index n is in air, $n_1 = n_2' = 1$, $n_1' = n_2 = n$, and noting
that $s_2 = s_1' - d$, we get $s_1' = n s_1$, $s_2' = s_1 - (d/n)$. The distance from object to
image is $\Delta = s_2' - s_1 + d$, or

$$ \Delta = d[1 - (1/n)]. \qquad (2.4.5) $$

Note that the displacement Δ is independent of the object distance and, as is true
in all cases in the paraxial approximation, independent of height y. For a typical
glass with n ≈ 1.5, we see that Δ ≈ d/3.

In the paraxial approximation an optical system is free of any aberrations, that 
is, an object point is imaged precisely into an image point. When the exact form 
of Snell’s law is used, however, most systems will have some form of aberration.
A thick plate is a good example of a simple system with aberration, that is, it fails 
to take all rays from a single object point into a single image point. This is easily 
shown by applying Snell’s law in its exact form at each surface. With the 
intermediate steps left to the reader, the geometry of Fig. 2.6 leads to 








$$ \Delta = d\left(1 - \frac{\cos i}{n \cos i'}\right), \qquad (2.4.6) $$

where i and i’ are the angles of incidence and refraction at the first surface.
A comparison of Eqs. (2.4.5) and (2.4.6) gives

$$ \Delta_{\rm exact} - \Delta_{\rm par} = \frac{d}{n}\left(1 - \frac{\cos i}{\cos i'}\right) \cong \frac{y_1^2\, d\,(n^2 - 1)}{2 s_1^2 n^3}, \qquad (2.4.7) $$

hence the image position depends on the ray height at the first surface. We
consider the aberrations of a thick plate in more detail later.

Fig. 2.6. Image shift Δ for plane-parallel plate of thickness d and index n in air, with
$n_1 = 1$, $n_1' = n_2 = n$, and $n_2' = 1$. See Eqs. (2.4.5) to (2.4.7) for discussion.
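
The size of this effect is easy to estimate numerically. The sketch below compares the paraxial shift of Eq. (2.4.5), the exact shift of Eq. (2.4.6), and the small-angle estimate of their difference from Eq. (2.4.7). It assumes that the angle of incidence at the first surface satisfies tan i = y1/|s1|; the function names and the plate and ray parameters are illustrative assumptions.

```python
import math

# Image shift of a plane-parallel plate: paraxial, exact, and small-angle difference.
# Plate and ray parameters are illustrative assumptions.

def shift_paraxial(d, n):
    """Eq. (2.4.5)."""
    return d * (1.0 - 1.0 / n)

def shift_exact(d, n, i):
    """Eq. (2.4.6); i is the angle of incidence in radians."""
    i_prime = math.asin(math.sin(i) / n)          # exact Snell's law
    return d * (1.0 - math.cos(i) / (n * math.cos(i_prime)))

def shift_difference_approx(d, n, y1, s1):
    """Right-hand side of Eq. (2.4.7)."""
    return y1**2 * d * (n**2 - 1.0) / (2.0 * s1**2 * n**3)

d, n = 10.0, 1.5          # plate thickness (mm) and index
s1, y1 = -100.0, 5.0      # object distance and ray height at first surface (mm)
i = math.atan(y1 / abs(s1))

delta_par = shift_paraxial(d, n)
delta_exact = shift_exact(d, n, i)
difference = delta_exact - delta_par
difference_approx = shift_difference_approx(d, n, y1, s1)

print(f"paraxial shift     = {delta_par:.5f} mm")
print(f"exact shift        = {delta_exact:.5f} mm")
print(f"difference         = {difference:.6f} mm")
print(f"Eq. (2.4.7) approx = {difference_approx:.6f} mm")
```

For these values the exact and approximate differences agree to within about one percent, and both grow as the square of the ray height.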


2.5. TWO-MIRROR TELESCOPES 


We now apply the results of the preceding sections to the general class of
two-mirror systems. In this section we are concerned only with the paraxial properties
of such systems, and will limit our discussion to the case where $s_1 = \infty$. Two
examples of particular two-mirror systems are shown in Fig. 2.7, the so-called
Cassegrain and Gregorian types, of which the Cassegrain is the more common
type for an optical telescope.

Fig. 2.7. Schematic diagrams of two-mirror reflecting telescopes: (a) Cassegrain; (b) Gregorian.
Designated parameters are $y_1$ and $y_2$, height of ray at margin of primary and secondary, respectively; D,
telescope diameter = $2|y_1|$; $2|y_2|$, diameter of secondary mirror; $R_1$ and $R_2$, vertex radius of curvature
of primary and secondary mirror, respectively; $s_2$ and $s_2'$, object and image distance of intermediate
object (located at focal point of primary) measured from the secondary mirror vertex; $f_1$, focal length
of primary mirror; and d, distance from primary to secondary, d < 0. See Table 2.1 for definitions of
normalized parameters.

Symbols in Fig. 2.7 are defined in the legend. Note that subscript 1 refers to 
the first mirror (primary) and 2 refers to the second mirror (secondary) in the 
optical train. For convenience, the sign of $f_1$ is taken positive when the primary is
concave. Following the sign convention, $y_1$ and $y_2$ have the same signs for a
Cassegrain and opposite signs for a Gregorian. 


2.5.a. NORMALIZED PARAMETERS 


It is very helpful to describe any two-mirror system in terms of a set of 
dimensionless or normalized parameters, defined as given in Table 2.1. Among 
the things to note for the entries in Table 2.1 are: (1) the focal ratios are defined as 
positive quantities; and (2) the dimensionless parameters do not change sign 
when the diagrams in Fig. 2.7 are reversed left for right. 

In Fig. 2.7 we have β > 0 when the focal point lies outside the space between
the primary and secondary. Depending on whether the system is Cassegrain or 
Gregorian, the signs of some of these dimensionless parameters differ. In 
particular, k and m are each positive for a Cassegrain and negative for a 
Gregorian, hence the product mk is positive for each of the telescopes shown 
in Fig. 2.7. 

The relationships between these parameters are obtained with the aid of Eqs. 
(2.3.1) and (2.3.3) applied to the secondary, and the relation $s_2 = kR_1/2$. The
steps are as follows:

$$ \frac{1}{s_2'} = \frac{2}{R_2} - \frac{2}{kR_1} = \frac{2}{R_1}\left(\frac{1}{\rho} - \frac{1}{k}\right) = \frac{1}{s_2}\left(\frac{k}{\rho} - 1\right) = -\frac{1}{m s_2}. $$


Table 2.1


Normalized Parameters for Two-Mirror Telescopes


$k = y_2/y_1$ = ratio of ray heights at mirror margins,
$\rho = R_2/R_1$ = ratio of mirror radii of curvature,
$m = -s_2'/s_2$ = transverse magnification of secondary,
$f_1\beta = D\eta$ = back focal distance, or distance from vertex of primary mirror to final focal point,
β and η, back focal distance in units of $f_1$ and D, respectively,
$F_1 = |f_1|/D$ = primary mirror focal ratio,
$W = (1 - k)f_1$ = distance from secondary to primary mirror,
= location of telescope entrance pupil relative to the secondary when the primary mirror is the
aperture stop,
$F = |f|/D$ = system focal ratio, where f is the telescope focal length.


Solving for m, and for ρ and k in turn, we get

$$ m = \frac{\rho}{\rho - k}, \qquad \rho = \frac{mk}{m - 1}, \qquad k = \frac{\rho(m - 1)}{m}. \qquad (2.5.1a) $$

We also find

$$ 1 + \beta = k(m + 1), \qquad \eta = F_1\beta. \qquad (2.5.1b) $$


It should be kept in mind that the relations in Eq. (2.5.1a,b) apply specifically to 
the case where the original object is at infinity. Given this caveat we will see that it 
is convenient to describe telescope characteristics in terms of these parameters,
especially system aberrations. 
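
The relations in Eqs. (2.5.1a,b) can be collected into a few small helper functions. The sketch below is illustrative only; the function names and the sample Cassegrain values (m = 4, k = 0.25, F1 = 2.5) are assumptions chosen for the example.

```python
# Normalized two-mirror parameters, Eqs. (2.5.1a,b); sample values are assumptions.

def m_from(rho, k):
    """Secondary magnification m = rho / (rho - k)."""
    return rho / (rho - k)

def rho_from(m, k):
    """Radius ratio rho = m k / (m - 1)."""
    return m * k / (m - 1.0)

def k_from(rho, m):
    """Ray-height ratio k = rho (m - 1) / m."""
    return rho * (m - 1.0) / m

def beta_from(k, m):
    """Back focal distance in units of f1, from 1 + beta = k (m + 1)."""
    return k * (m + 1.0) - 1.0

def eta_from(F1, beta):
    """Back focal distance in units of D, eta = F1 * beta."""
    return F1 * beta

m, k, F1 = 4.0, 0.25, 2.5           # assumed Cassegrain example
rho = rho_from(m, k)
beta = beta_from(k, m)
eta = eta_from(F1, beta)

print(f"rho = {rho:.4f}, beta = {beta:.4f}, eta = {eta:.4f}")
print(f"round trip: m = {m_from(rho, k):.4f}, k = {k_from(rho, m):.4f}")
```

Any two of m, k, and ρ fix the third, so the round trip returns the starting values.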

The net power of a two-mirror telescope is found by using Eq. (2.4.3), which 
can be rewritten as 


$$ P = P_1\left[1 + \frac{P_2}{P_1} - \frac{d}{n} P_2\right]. $$

From Eq. (2.3.2) we find $P_1 = -2/R_1$, $P_2 = 2/R_2$, hence $P_2/P_1 = -1/\rho$. In
using Eq. (2.3.2) note that n = 1 for the primary and n = −1 for the secondary,
according to the sign convention. For the arrangements shown in Fig. 2.7, both d
and n are negative; the light is traveling from right to left and the secondary
mirror is to the left of the primary. Hence d/n is positive. In terms of the
dimensionless parameters from Eq. (2.5.1), we find that $d/n = (1 - k)/P_1$, and

$$ P = P_1[1 - (k/\rho)] = P_1/m, \qquad (2.5.2) $$


hence the telescope power is positive for a Cassegrain telescope and negative for 
a Gregorian. In accord with our convention for single mirrors, we take telescope 
focal length positive for a Cassegrain and negative for a Gregorian. In terms of 
the focal lengths and focal ratios, therefore, 


$$ m = f/f_1, \qquad F = |m| F_1. \qquad (2.5.3) $$


The difference in sign between the focal lengths of a Cassegrain and Gregorian, 
and their magnifications, requires some discussion. 

Consider the rays reflected from the secondary in Fig. 2.7(a). If these rays are 
extended to the left until they intersect their corresponding incident rays, the 
distance between the intersection plane and the focal point, measured along the 
axis, is the focal length. The focal point lies to the right of the intersecting rays, 
hence the focal length is positive. This is similar to the situation shown in Fig. 2.5 
for a thick lens. Following the same procedure for the Gregorian in Fig. 2.7(b), 
the incident rays and the rays reflected from the secondary must be extended to 
the right to locate the intersection plane, hence the focal length is negative. 

As for the magnification, its sign according to our convention is positive if the 
image made by the secondary has the same orientation as the object for the
secondary. This is the case for the secondary mirror in a Cassegrain telescope as
shown in Fig. 2.7(a). But this telescope has a final image that is inverted with
respect to the original object on the sky. This is because the image given by the 
primary is inverted, hence the final image is also inverted. For the Gregorian 
telescope the final image is erect relative to the object on the sky because each 
mirror inverts its object. Thus these two types of telescopes might simply be 
characterized as follows: 


Cassegrain: f > 0, m > 0, final image inverted; 
Gregorian: f <0, m <0, final image erect. 


2.5.b. OTHER TELESCOPE CHARACTERISTICS


Given our introduction of normalized parameters for two-mirror telescopes, it 
is appropriate to discuss other characteristics in terms of them. Among these are 
telescope scale, effect of secondary mirror displacement on focal surface location, 
secondary mirror to focal point separation, diameter of secondary mirror as a 
function of field size, and overall telescope length. 

We are limiting our discussion here to optical systems for which the original 
object is effectively at an infinite distance, hence it is not possible to give a useful 
formula for the magnification of the system. Rather it is the telescope scale that 
provides a useful parameter of the telescope. For a telescope of focal length f, the 
scale is 


$$ S\,(\text{arc-sec/mm}) = \frac{206265}{f\,(\text{mm})}, \qquad (2.5.4) $$

where the units of arc-sec/mm are those most often used. For conversion to radian
measure the identities 0.206 arc-sec = 1 μrad and 3.44 arc-min = 1 mrad can be
used. Equation (2.5.4) applies to a telescope with any number of mirrors.

For a given pair of primary and secondary mirrors the location of the telescope
focal surface depends on the location of the secondary, as given by Eq. (2.5.1). If
the secondary is moved along the optical axis, then both m and k are changed, and
so also is the position of the focal surface.

Let $ds_2'$ be the displacement of the focal surface, or focal surface shift, when
the secondary is displaced by $ds_2$. Differentiating Eq. (2.3.1) while holding $R_2$
constant, we find that

$$ ds_2' = -m^2\, ds_2, \qquad (2.5.5) $$

where $ds_2 < 0$ when the secondary is moved closer to the primary.
The displacement given by Eq. (2.5.5) is measured relative to the secondary,
which is now in a new position. Relative to the object at the focal point of the
primary, or relative to the primary, the focal surface has moved by
$ds_2' - ds_2 = -(m^2 + 1)\,ds_2$. This relation can easily be checked by working
with the dimensionless parameters. From Eq. (2.5.1) we have

$$ \beta = k(m + 1) - 1 = k\left(\frac{2\rho - k}{\rho - k}\right) - 1, \qquad \frac{d\beta}{dk} = \frac{\rho^2}{(\rho - k)^2} + 1 = m^2 + 1, \qquad (2.5.6) $$

where $dk = -ds_2/f_1$, and dβ = shift of focal surface relative to the primary
mirror vertex in units of $f_1$.
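
The telescope scale of Eq. (2.5.4) and the focal surface shift of Eqs. (2.5.5) and (2.5.6) can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the function names, the 10-m focal length, and the parameter values ρ = 1/3 and k = 0.25 are hypothetical, and the derivative in Eq. (2.5.6) is checked by a simple central difference.

```python
# Telescope scale and focal-surface shift; all numerical values are assumptions.

ARCSEC_PER_RADIAN = 206265.0

def telescope_scale(f_mm):
    """Scale in arc-sec/mm for focal length f in mm, Eq. (2.5.4)."""
    return ARCSEC_PER_RADIAN / f_mm

def beta_of_k(k, rho):
    """beta = k (2 rho - k) / (rho - k) - 1, from Eq. (2.5.6)."""
    return k * (2.0 * rho - k) / (rho - k) - 1.0

def focal_shift_relative_to_primary(m, ds2):
    """Shift of the focal surface relative to the primary, -(m^2 + 1) ds2."""
    return -(m**2 + 1.0) * ds2

scale = telescope_scale(10000.0)          # assumed f = 10 m

rho, k = 1.0 / 3.0, 0.25                  # assumed Cassegrain, gives m = 4
m = rho / (rho - k)
step = 1e-6
dbeta_dk = (beta_of_k(k + step, rho) - beta_of_k(k - step, rho)) / (2.0 * step)

shift = focal_shift_relative_to_primary(m, -1.0)   # secondary moved 1 mm toward primary

print(f"scale = {scale:.4f} arc-sec/mm")
print(f"d(beta)/dk = {dbeta_dk:.5f}, m^2 + 1 = {m**2 + 1.0:.5f}")
print(f"focal surface shift = {shift:.2f} mm")
```

With m = 4, a 1-mm motion of the secondary moves the focal surface by 17 mm, which shows why the secondary position is a sensitive focus adjustment.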

Although Eq. (2.5.6) does not set any apparent limit on how far the secondary 
can be moved, there is a limit set by the onset of aberrations. Two-mirror 
telescopes generally have a mirror separation set to make the on-axis aberration 
zero. For a different secondary position the on-axis aberration is no longer zero, 
and its size sets a practical limit to the amount of secondary displacement. These 
limits will be considered during our discussion of telescope aberrations.

Both Cassegrain and Gregorian telescopes, especially the former, can have a 
long overall focal length in a mechanical structure that is many times shorter. 
From Fig. 2.7 we see that a typical Cassegrain telescope has a secondary mirror to
focal surface separation comparable to $f_1$, or about m times smaller than f. More
precisely, the secondary mirror-focal surface distance is $f_1(1 + \beta - k)$, or
1 + β − k in units of $f_1$. Using Eq. (2.5.1) we get

$$ \text{secondary-focal surface distance} = mk f_1, \qquad (2.5.7) $$


a relation that applies to both Cassegrain and Gregorian types. We show in 
Chapter 6 that a Gregorian telescope is significantly longer than a Cassegrain if 
both telescopes have the same values of |m| and $f_1$, hence the same focal length.
The advantages of a relatively short structure from an engineering point of view 
are obvious because there will be less flexure in a short telescope than in a long 
one. 

Another difference between these two types of telescopes is the size of the 
secondary required to accept all of the light reflected from the primary. Each 
diagram in Fig. 2.7 shows a secondary mirror whose diameter is |k|D, the
minimum required for a single point source. To cover a field on the sky of angular
diameter 2θ without vignetting any light from the primary, the secondary must be
larger by $2\theta(1 - k) f_1 = 2\theta F_1(1 - k)D$. Thus the full diameter of the secondary is

$$ D_2 = D[\,|k| + 2\theta F_1(1 - k)]. \qquad (2.5.8) $$

Because k < 0 for a Gregorian, the diameter $D_2$ of the Gregorian secondary is
larger for the same θ and $F_1$, hence it blocks a larger fraction of the light headed
for the primary.

These two-mirror designs have the added feature that the system focal length is
easily changed simply by putting in a different secondary mirror, without
changing the physical length by a large factor. As an example, consider a
Cassegrain telescope with parameters $F_1 = 3$, β = 0.25, and m = 3. Using
Eqs. (2.5.1) we find k = 0.3125 and the normalized secondary mirror-focal
surface separation is 0.9375. If we choose to increase the telescope focal length
by a factor of three, hence making m = 9, while keeping β = 0.25, then
k = 0.125 and mk = 1.125. The modified telescope is only 1.2 times longer
than the original one. 
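
The numbers in this example follow directly from Eq. (2.5.1b) and Eq. (2.5.7), as the short sketch below shows. The function name is a hypothetical helper; the values of β and m are those quoted above.

```python
# Reproduces the worked Cassegrain example: beta = 0.25 with m = 3 and m = 9.

def k_from_beta(beta, m):
    """Ray-height ratio from 1 + beta = k (m + 1), Eq. (2.5.1b)."""
    return (1.0 + beta) / (m + 1.0)

beta = 0.25
results = {}
for m in (3.0, 9.0):
    k = k_from_beta(beta, m)
    separation = m * k            # secondary to focal surface, units of f1, Eq. (2.5.7)
    results[m] = (k, separation)
    print(f"m = {m:.0f}: k = {k:.4f}, mk = {separation:.4f}")

length_ratio = results[9.0][1] / results[3.0][1]
print(f"ratio of secondary-focal surface separations = {length_ratio:.2f}")
```

Tripling the focal length therefore changes the secondary to focal surface separation by only the factor 1.2 quoted in the text.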

A final, and very significant, advantage of two-mirror systems is the additional 
freedom provided for controlling image quality. With proper choices of surface 
parameters it is possible to have the aberrations of the primary canceled, entirely 
or in part, by those of the secondary, thus giving a system with better image 
quality. We discuss these considerations in detail in subsequent chapters. 


2.6. STOPS AND PUPILS 


We now turn our attention to the important topic of stops and pupils. Our 
discussion, although brief, will cover the essential points. For a more complete 
discussion the reader should consult any of the intermediate-level texts listed in 
the bibliography at the end of the chapter. 


2.6.a. DEFINITIONS AND BASICS 


The aperture stop is an element of an optical system that determines the 
amount of light reaching the image. This stop is often the boundary of a lens or 
mirror, although it may be a separate diaphragm. In addition to controlling the 
amount of light entering the system, it also is one of the determining factors in the 
sizes of system aberrations. For most telescopes the primary mirror serves as the 
aperture stop, although in many infrared telescopes the secondary mirror is the 
aperture stop. 

The field stop is an element that determines the angular size of the object field 
that is imaged by the system. In most systems the boundary of the field stop is the 
edge of the detector, although it may also be a separate diaphragm in an image 
plane ahead of the detector. 

In a general optical system the image of the aperture stop formed by that part 
of the system preceding it in the optical train is called the entrance pupil. For two- 
mirror telescopes in which the primary mirror is the aperture stop, as well as for 
prime focus (single mirror) and refracting telescopes, no imaging elements 
precede the aperture stop. In this case the entrance pupil coincides with the
aperture stop. For infrared telescopes the aperture stop (secondary mirror) is
preceded by the primary mirror. In this case the entrance pupil is the same 
diameter as the primary mirror, an exercise left for the reader. 

The image of the aperture stop formed by that part of the system following it is 
called the exit pupil. The significance of the exit pupil is that rays from the 
boundary of the aperture stop approach the final image point as if coming from 
the boundary of the exit pupil, for all incidence angles at the aperture stop 
boundary. If the secondary mirror is the aperture stop, then there are no telescope 
optics following the aperture stop and the telescope exit pupil coincides with the 
stop. 


2.6.b. PUPILS FOR TWO-MIRROR TELESCOPES 


We now apply these definitions to telescopes of the type shown in Fig. 2.7. 
Taking the aperture stop at the primary, at distance $W = (1 - k) f_1$ from the
secondary, the exit pupil is the image of the primary formed by the secondary. 
Figure 2.8 shows the exit pupil location for a Cassegrain telescope; for a 
Gregorian the exit pupil is located between the primary and secondary mirrors. 

Applying Eq. (2.3.1) to the geometry in Fig. 2.8, with $f_1\delta$ defined as the
distance from the exit pupil to the telescope focal point, and converting to 
normalized parameters, gives 


$$ \delta = \frac{m^2 k}{m + k - 1} = \frac{m^2(1 + \beta)}{m^2 + \beta}, \qquad (2.6.1) $$

where δ > 0 when the focal surface of the system lies to the right of the exit
pupil, as shown in Fig. 2.8. Although Eq. (2.6.1) was derived from the diagram
for a Cassegrain, it also applies to a Gregorian telescope. The distance from the
secondary mirror to the exit pupil, in normalized parameters, is mk − δ. From
Eqs. (2.5.7) and (2.6.1) we find

$$ \text{secondary-exit pupil distance} = \frac{mk(k - 1)}{m + k - 1} f_1. \qquad (2.6.2) $$

Using Eqs. (2.3.3) and (2.2.3) we find that the exit pupil diameter is

$$ D_{ex} = D\,|\delta/m| = f_1\,|\delta/F|. \qquad (2.6.3) $$

Fig. 2.8. Location of exit pupil for Cassegrain telescope. The exit pupil is closer to the secondary
than is the primary focal point. See Eq. (2.6.1).



Because the centers of the aperture stop and exit pupil are on the axis of the 
telescope, the so-called chief ray appears to come from the center of the exit pupil 
after reflection from the secondary. The chief ray is defined as the ray that passes 
through the center of the aperture stop. If the angle of incidence of the chief ray at
the primary is θ, its angle with respect to the telescope axis is ψ after reflection
from the secondary. The relation between these angles is easily derived from the
geometry shown in Fig. 2.9, where the focal length of a thin-lens refracting
telescope equivalent to a Cassegrain type is f. From Fig. 2.9

$$ \psi f_1 \delta = f\theta = m f_1 \theta, \qquad (2.6.4) $$

hence ψ/θ = m/δ. Because δ is generally of order unity, the chief ray angle at the
focal surface is of order m times larger than the incident chief ray angle.

If the secondary mirror is the aperture stop, then the exit pupil coincides with
the stop. In this case δ = mk, and ψ/θ = 1/k, or again of order m because mk is
usually of order unity in size. 
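
The pupil relations of this section are summarized in the sketch below, which evaluates Eqs. (2.6.1) to (2.6.4) for one set of parameters. The function names and the sample values (m = 4, k = 0.25, D = 1000 mm) are illustrative assumptions; the aperture stop is taken at the primary.

```python
# Exit pupil location, exit pupil diameter, and chief-ray angle ratio for a
# two-mirror telescope with the stop at the primary. Sample values are assumptions.

def delta_from_mk(m, k):
    """Exit pupil to focal point distance in units of f1, Eq. (2.6.1)."""
    return m**2 * k / (m + k - 1.0)

def delta_from_mbeta(m, beta):
    """Second form of Eq. (2.6.1), in terms of m and beta."""
    return m**2 * (1.0 + beta) / (m**2 + beta)

def secondary_to_exit_pupil(m, k):
    """Secondary to exit pupil distance in units of f1, Eq. (2.6.2)."""
    return m * k * (k - 1.0) / (m + k - 1.0)

def exit_pupil_diameter(D, m, k):
    """Exit pupil diameter, Eq. (2.6.3)."""
    return D * abs(delta_from_mk(m, k) / m)

def chief_ray_ratio(m, k):
    """Ratio psi/theta of final to incident chief ray angle, from Eq. (2.6.4)."""
    return m / delta_from_mk(m, k)

m, k, D = 4.0, 0.25, 1000.0
beta = k * (m + 1.0) - 1.0                  # Eq. (2.5.1b)

delta = delta_from_mk(m, k)
delta_check = delta_from_mbeta(m, beta)
pupil_offset = secondary_to_exit_pupil(m, k)
D_ex = exit_pupil_diameter(D, m, k)
ratio = chief_ray_ratio(m, k)

print(f"delta = {delta:.4f} (from m, beta: {delta_check:.4f})")
print(f"secondary-exit pupil distance = {pupil_offset:.4f} f1")
print(f"exit pupil diameter = {D_ex:.1f} mm, psi/theta = {ratio:.3f}")
```

Here δ is of order unity, as stated above, so ψ/θ is comparable to m.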


2.6.c. EXAMPLES OF PUPILS 


The importance of stops and pupils is especially evident when auxiliary optics 
following the telescope are used to improve overall image quality. In both of the 
examples discussed here, one or more optical elements reimage the exit pupil of
the telescope on to an optical element whose main function is improvement of the
image quality. Generally there are additional optical requirements for these
optical elements, but these are not relevant to our discussion of pupils.

Fig. 2.9. Relation between incident and final chief ray angles, θ and ψ, respectively, in two-mirror
telescope. Here L is the lens of equivalent refractor, EP the exit pupil, FP the focal plane. See Eq.
(2.6.4).

The most dramatic example of the improvement of image quality was the 
“fix” of the spherical aberration (SA) present in the images produced by the 
Hubble Space Telescope (HST) when it was launched in 1990. We will discuss 
this aberration and the nature of the optical fix in detail in subsequent chapters; at 
this stage we consider only the role played by pupils in the fix. 

The SA present in the HST images was attributed to a primary mirror that had 
been incorrectly figured. Although the mirror is of superb quality, its shape is less 
curved than that of the optical prescription, with the maximum difference of 
about 2 μm at the edge of the mirror. The approach adopted to compensate for this
error was to place a pair of mirrors (we will call them M1 and M2) into the 
converging beam near the telescope focus and to make M2 with a corresponding 
difference, but more curved rather than less. Each point on mirror M2 must be in 
one-to-one correspondence with a point on the primary, hence must be located at 
a pupil. The purpose of mirror M1 is to reimage the exit pupil of the telescope on 
to M2, that is, the exit pupil of the HST is the object for M1 with the image 
placed on M2. 

Another example showing the importance of pupils occurs in the case of 
adaptive optics, the compensation in real time of the degrading effects of the
Earth’s atmosphere on starlight passing through it. (A discussion of the principles 
of adaptive optics follows in later chapters.) At this point we simply point out that 
the light reaching the primary mirror of a ground-based telescope is distorted by 
the atmosphere in a random way on a timescale of milliseconds. Although this 
mirror may be capable of producing a near-perfect image, to the remaining optics 
in the telescope it is as if the light from the primary is coming from a “rubber” 
mirror with ever-changing shape on a small scale. The “fix” in this case is
auxiliary optics that must reimage the telescope exit pupil on to a flexible mirror, 
sense and measure the distortion in the incoming light, and transmit the distortion 
to the flexible mirror in a reversed form to effect compensation. 

These two examples are really quite similar. In both cases the exit pupil is 
reimaged on to a mirror that compensates for a distortion preceding it in the 
optical train. The major difference is that the correction is static in one case and 
dynamic in the other. 


2.7. CONCLUDING REMARKS 


The material in this chapter, based as it is on paraxial optics, is only an 
introduction to a much larger subject area. We have included topics deemed
essential for further discussion of telescopes and auxiliary instruments used with
them, but left out topics such as principal planes, nodal points, and the methods 
of ray tracing. Any of the intermediate texts listed in the following bibliography 
should be consulted for an all-inclusive look into ray optics. A thorough 
presentation of the exact tracing of rays through an optical system is given by 
Welford (1986). 

Our discussion of two-mirror telescopes is also only a beginning into an 
analysis of telescopes generally. We limited our presentation to an introduction of 
normalized parameters and their utility in describing the properties of two broad 
classes of two-mirror telescopes. In the following chapters we will go into much 
more detail, especially on aberrations and image characteristics of many types of 
telescopes within these classes, as well as for other types. Thorough discussions 
of the properties of telescopes are given by Wilson (1996) at an advanced level 
and by Rutten and van Venrooij (1988) at an intermediate level.


Chapter 3 Fermat’s Principle: An Introduction


A very powerful method in dealing with geometrical optics, the analysis of 
optical systems by tracing rays, is a principle ascribed to Fermat. For a single 
plane reflecting or refracting surface it states that the actual path that a light ray 
follows, from one point to another via the surface, is one for which the time 
required is a minimum. For this particular case, Fermat’s Principle can be called 
the principle of least time. 

Although the principle as stated here is correct for a single surface, it must be 
modified for application to a general optical system. In its modern form Fermat’s 
Principle states that the actual path that a ray follows is such that the time of travel 
between two fixed points has a stationary value with respect to small changes of 
that path. In other words, the path of a ray from one point to another is such that 
the time taken has no more than an infinitesimal difference of second order from 
the time taken in traveling along other closely adjacent paths between the same 
points. Hence, to a first approximation, the travel time of the actual ray is equal to 
that along a closely adjacent path. 

We first look at some of the consequences of this statement from a general 
point of view. The discussion involving calculus of variations can be skipped on a 
first reading, though results derived for the atmosphere are important for 
observations with ground-based telescopes. In subsequent sections we look at a 
number of other specifics that follow from this principle.


3.1. FERMAT’S PRINCIPLE IN GENERAL


The simplest case illustrating Fermat’s Principle is shown in Fig. 3.1. A surface
Σ lies between two points, $P_0$ and $P_1$, with a ray joining these points consisting of
straight line segments. The solid line is the actual ray path and the dashed line
some other path. If the time of travel from $P_0$ to $P_1$ is denoted by t, then the
condition that t have a stationary value for the actual path is

$$ \partial t/\partial x = \partial t/\partial y = 0, \qquad (3.1.1) $$


where x, y are the generalized coordinates of the point where the ray intersects the 
surface. 

An equivalent statement of Fermat’s Principle is obtained by replacing the 
words time of travel with optical path length. If dt is an infinitesimal time of 
travel, then cdt is the corresponding optical path length, where c is the velocity of 
light in vacuum. The optical path length (hereafter denoted by OPL) is expressed 
in terms of the geometrical path length and index of refraction as follows: 


$$ d(\mathrm{OPL}) = c\,dt = (c/v)\,v\,dt = n\,ds, $$

$$ \mathrm{OPL} = c\int dt = \int n\,ds, \qquad (3.1.2) $$


where v is the speed of light in the medium of index n. The general statement of 
Fermat’s Principle is either δt = 0 or δ(OPL) = 0, where n can be a function of
all the coordinates that specify the position. 
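
For the single plane interface of Fig. 3.1, the stationary condition can be demonstrated numerically: minimizing the OPL over the crossing point on the interface reproduces Snell’s law. The sketch below is illustrative only. It assumes a plane interface at z = 0, uses a simple golden-section search written for this example, and all coordinates and indices are hypothetical.

```python
import math

# Fermat's Principle at a plane interface (z = 0): the crossing height y that
# minimizes the optical path length satisfies Snell's law. Values are assumptions.

def optical_path_length(y, p0, p1, n0, n1):
    """OPL of the two-segment path from p0 to (y, 0) to p1; points are (y, z)."""
    (y0, z0), (y1, z1) = p0, p1
    return n0 * math.hypot(y - y0, z0) + n1 * math.hypot(y1 - y, z1)

def golden_section_minimum(f, a, b, tol=1e-12):
    """Locate the minimum of a unimodal function f on the interval [a, b]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while abs(b - a) > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

p0, p1 = (0.0, -1.0), (1.0, 1.0)   # start in medium 0 (z < 0), end in medium 1 (z > 0)
n0, n1 = 1.0, 1.5

y_best = golden_section_minimum(
    lambda y: optical_path_length(y, p0, p1, n0, n1), 0.0, 1.0)

sin_i0 = (y_best - p0[0]) / math.hypot(y_best - p0[0], p0[1])
sin_i1 = (p1[0] - y_best) / math.hypot(p1[0] - y_best, p1[1])

print(f"crossing height y = {y_best:.6f}")
print(f"n0 sin i0 = {n0 * sin_i0:.6f}, n1 sin i1 = {n1 * sin_i1:.6f}")
```

The two products agree to within the precision of the search, as Snell’s law requires.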

Fig. 3.1. Possible ray paths through interface between different optical media.

We now consider the two-dimensional (2D) case where the index of refraction
n = n(y, z) and $ds = \sqrt{dy^2 + dz^2}$. Letting y’ = dy/dz, Fermat’s Principle gives

$$ \delta \int_{P_0}^{P_1} n(y, z)\sqrt{1 + y'^2}\,dz = 0, \qquad (3.1.3) $$

where ds has been replaced by $dz\sqrt{1 + y'^2}$. Letting F(y, y’, z) represent the
integrand in Eq. (3.1.3) we get





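$$ \delta\int_{P_0}^{P_1} F(y, y', z)\,dz = \int_{P_0}^{P_1} \delta F(y, y', z)\,dz = 0, \qquad (3.1.4) $$

where

$$ \delta F = \frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y' = \frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\frac{d}{dz}(\delta y). $$

Substituting for δF in Eq. (3.1.4) and integrating the term containing y’ by parts,
we get

$$ \int_{P_0}^{P_1} \frac{\partial F}{\partial y}\,\delta y\,dz + \left[\frac{\partial F}{\partial y'}\,\delta y\right]_{P_0}^{P_1} - \int_{P_0}^{P_1} \frac{d}{dz}\left(\frac{\partial F}{\partial y'}\right)\delta y\,dz = 0. \qquad (3.1.5) $$

The second term in Eq. (3.1.5) is zero because δy is zero at the endpoints.
Therefore we can write Eq. (3.1.5) as

$$ \int_{P_0}^{P_1}\left[\frac{\partial F}{\partial y} - \frac{d}{dz}\left(\frac{\partial F}{\partial y'}\right)\right]\delta y\,dz = 0. $$

This expression must vanish for an arbitrary δy and therefore

$$ \frac{\partial F}{\partial y} - \frac{d}{dz}\left(\frac{\partial F}{\partial y'}\right) = 0, \qquad (3.1.6) $$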


which is the equation required to satisfy Fermat’s Principle. 

We now take Eq. (3.1.6), replace $F$ with the expression it represents, and carry
out the differentiations indicated. As noted following Eq. (3.1.3) we have
$F = n(y, z)\sqrt{1 + y'^2}$. Noting that $y'$ is not an explicit function of $y$, nor is $n$ a
function of $y'$, we get

$$\sqrt{1 + y'^2}\,\frac{\partial n}{\partial y} - \frac{y'}{\sqrt{1 + y'^2}}\frac{dn}{dz} - \frac{n\,y''}{(1 + y'^2)^{3/2}} = 0. \tag{3.1.7}$$

Although Eq. (3.1.7) is a rather formidable equation in appearance, it is easily
simplified after making some trigonometric substitutions. Figure 3.2 shows a
segment of the ray path with the dashed line tangent to the path at point $P$. At this
point the tangent makes an angle $\alpha$ with the z-axis, so that

$$\sin\alpha = \frac{dy}{ds} = \frac{y'}{\sqrt{1 + y'^2}}, \qquad \cos\alpha = \frac{dz}{ds} = \frac{1}{\sqrt{1 + y'^2}}, \qquad y'' = \frac{d(\tan\alpha)}{dz} = \sec^2\alpha\,\frac{d\alpha}{dz}. \tag{3.1.8}$$

Fig. 3.2. Small segment of curved ray path in inhomogeneous medium. Dashed line is tangent to
ray at point P.
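Using Eqs. (3.1.8) and noting that

$$\frac{dn}{dz} = \frac{\partial n}{\partial z} + y'\frac{\partial n}{\partial y},$$

we write Eq. (3.1.7) as

$$\cos\alpha\,\frac{\partial n}{\partial y} - \sin\alpha\,\frac{\partial n}{\partial z} = n\cos\alpha\,\frac{d\alpha}{dz}. \tag{3.1.9}$$

As a final item we note that the curvature $\kappa$ of a path in space is defined as

$$\kappa = \frac{d\alpha}{ds} = \frac{d\alpha}{dz}\frac{dz}{ds} = \cos\alpha\,\frac{d\alpha}{dz}.$$

Substitution of this result into Eq. (3.1.9) gives

$$n\kappa = n\frac{d\alpha}{ds} = \cos\alpha\,\frac{\partial n}{\partial y} - \sin\alpha\,\frac{\partial n}{\partial z}. \tag{3.1.10}$$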


The result in Eq. (3.1.10) gives the local curvature of a light ray subject to 
Fermat’s Principle in a medium in which the index of refraction is a smoothly 
varying function of position. Note that this relation applies to a ray in the yz-plane
with $n = n(y, z)$.

As a special case of Eq. (3.1.10), assume that index n is constant. In this case 
the partial derivatives on the right side of Eq. (3.1.10) are zero and hence the 
curvature is zero. Thus the path of a light ray in a homogeneous medium is a 
straight line.

We now apply these results in turn to optical surfaces separating homogeneous
media, such as air and glass, and to one inhomogeneous medium, the Earth’s 
atmosphere. 
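
As a numerical illustration of Eq. (3.1.10), the sketch below integrates the ray angle $\alpha$ along the arc length $s$ using $dy/ds = \sin\alpha$, $dz/ds = \cos\alpha$, and $d\alpha/ds = \kappa$. The function name `trace_ray`, the midpoint integration scheme, the step size, and the linear index profile $n = 1 + 0.1z$ are assumptions chosen only for illustration. For a layered medium with $n = n(z)$, Eq. (3.1.10) implies that $n\sin\alpha$ is constant along the ray, which gives a convenient check on the integration.

```python
import math


def trace_ray(n, dn_dy, dn_dz, y, z, alpha, ds=1.0e-3, steps=2000):
    """Integrate Eq. (3.1.10) along the ray with a midpoint (RK2) scheme.

    alpha is the angle between the ray tangent and the z-axis, so that
    dy/ds = sin(alpha), dz/ds = cos(alpha), d(alpha)/ds = kappa.
    """
    def kappa(yy, zz, aa):
        return (math.cos(aa) * dn_dy(yy, zz)
                - math.sin(aa) * dn_dz(yy, zz)) / n(yy, zz)

    for _ in range(steps):
        # Half step to the midpoint of the interval.
        y_mid = y + 0.5 * ds * math.sin(alpha)
        z_mid = z + 0.5 * ds * math.cos(alpha)
        a_mid = alpha + 0.5 * ds * kappa(y, z, alpha)
        # Full step using the midpoint direction and curvature.
        y += ds * math.sin(a_mid)
        z += ds * math.cos(a_mid)
        alpha += ds * kappa(y_mid, z_mid, a_mid)
    return y, z, alpha


def n_layered(y, z):
    return 1.0 + 0.1 * z      # hypothetical profile with n = n(z) only


def zero(y, z):
    return 0.0


def slope(y, z):
    return 0.1                # dn/dz of the hypothetical profile


alpha_start = math.radians(30.0)
y_end, z_end, alpha_end = trace_ray(n_layered, zero, slope, 0.0, 0.0, alpha_start)
invariant_start = n_layered(0.0, 0.0) * math.sin(alpha_start)
invariant_end = n_layered(y_end, z_end) * math.sin(alpha_end)
print(invariant_start, invariant_end)

# Homogeneous medium: both partial derivatives vanish and the angle never changes.
_, _, alpha_flat = trace_ray(lambda y, z: 1.5, zero, zero, 0.0, 0.0, alpha_start)
print(alpha_flat == alpha_start)
```

The two printed invariants should agree to within the integration error, and the ray angle decreases as the ray moves into the region of higher index.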


3.2. FERMAT’S PRINCIPLE AND REFRACTING SURFACES 


In this section we consider several examples of refracting surfaces and 
approach them from the point of view of Fermat’s Principle. In doing this we 
will rederive some of the results of Chapter 2 as well as find some new ones. For 
all of the cases discussed we assume that homogeneous media are separated by a 
surface across which the index changes abruptly. 


3.2.a. LAWS OF REFRACTION AND REFLECTION 
Fermat’s Principle can be used to derive Snell’s law at a plane interface where
the index changes from $n$ to $n'$, as shown in Fig. 3.3. For this situation the
condition that the path is stationary is, from Eq. (3.1.2), given by

$$\delta\left[n\int_{P_1}^{P_0} ds + n'\int_{P_0}^{P_2} ds\right] = 0,$$

which, upon evaluation of the integrals, gives

$$\delta\left\{n\sqrt{z_1^2 + y_0^2} + n'\sqrt{z_2^2 + (y_2 - y_0)^2}\right\} = 0. \tag{3.2.1}$$





Fig. 3.3. Ray through plane interface between two homogeneous media with different indices of
refraction.

This is, as expected, simply the sum of two optical lengths. Our variable is $y_0$ and
differentiating Eq. (3.2.1) gives

$$\left\{n\frac{d}{dy_0}\sqrt{z_1^2 + y_0^2} + n'\frac{d}{dy_0}\sqrt{z_2^2 + (y_2 - y_0)^2}\right\}\delta y_0 = 0.$$

The expression in braces in this relation is independent of $\delta y_0$, and therefore we
set the expression equal to zero. Doing the differentiation gives

$$\frac{n\,y_0}{\sqrt{z_1^2 + y_0^2}} - \frac{n'(y_2 - y_0)}{\sqrt{z_2^2 + (y_2 - y_0)^2}} = 0. \tag{3.2.2}$$

Examination of Fig. 3.3 shows that the factors multiplying $n$ and $n'$ are $\sin i$ and
$\sin i'$, respectively, and hence Eq. (3.2.2) is simply Snell’s law of refraction,
$n\sin i = n'\sin i'$. The law of reflection follows directly if we let $n' = -n$, in
which case we also have $i' = -i$.

The nature of this stationary condition can be examined further by
differentiating Eq. (3.2.2) with respect to $y_0$ and looking at the sign of the result.
Because the sign is positive, the path taken by the ray in going from one fixed
point to another is such that its time of travel or OPL is a minimum.
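
A short numerical check of this result is sketched below. It finds the root of the left side of Eq. (3.2.2) by bisection and then compares $n\sin i$ with $n'\sin i'$. The indices and point positions are arbitrary illustrative values, and the function names are ours rather than part of the derivation.

```python
import math


def opl_derivative(y0, n, n_prime, z1, z2, y2):
    """Left side of Eq. (3.2.2): d(OPL)/dy0 for a plane interface at z = 0."""
    return (n * y0 / math.hypot(z1, y0)
            - n_prime * (y2 - y0) / math.hypot(z2, y2 - y0))


def stationary_height(n, n_prime, z1, z2, y2):
    """Bisection for the y0 at which the OPL is stationary (0 <= y0 <= y2)."""
    lo, hi = 0.0, y2
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if opl_derivative(mid, n, n_prime, z1, z2, y2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values: source one unit in front of the interface at y = 0,
# destination one unit behind it at height y2 = 1.
n, n_prime, z1, z2, y2 = 1.0, 1.5, 1.0, 1.0, 1.0
y0 = stationary_height(n, n_prime, z1, z2, y2)
sin_i = y0 / math.hypot(z1, y0)
sin_i_prime = (y2 - y0) / math.hypot(z2, y2 - y0)
print(n * sin_i, n_prime * sin_i_prime)
```

Bisection is sufficient here because the derivative of the OPL increases monotonically with $y_0$, which is the same property that makes the stationary path a minimum.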


3.2.b. SPHERICAL INTERFACE


Although we have already derived the paraxial equation for refraction at a
spherical interface in Section 2.2, we will repeat the exercise using Fermat’s
Principle. The spherical surface separating two homogeneous media, along with
conjugate points $B$ and $B'$, is shown in Fig. 3.4. With due regard for signs
according to the Cartesian convention, the optical length $L$ from $B$ to $B'$ via point
$P$ is given by $L = -nl + n'l'$, where from the law of cosines

$$l = -\sqrt{R^2 + (R - s)^2 - 2R(R - s)\cos\phi},$$

$$l' = \sqrt{R^2 + (s' - R)^2 + 2R(s' - R)\cos\phi}.$$





Fig. 3.4. Refraction at spherical interface.

Substituting $l$ and $l'$ into $L$, we have an expression in which $\phi$ is the variable. We
apply Fermat’s Principle and find the stationary condition by setting $dL/d\phi = 0$.
This gives

$$\frac{dL}{d\phi} = -\frac{nR(R - s)\sin\phi}{l} - \frac{n'R(s' - R)\sin\phi}{l'} = 0. \tag{3.2.3}$$

In the paraxial limit $l = s$ and $l' = s'$. Substitution of these into Eq. (3.2.3)
immediately leads to Eq. (2.2.5).
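
The sketch below illustrates how the stationary condition behaves outside the paraxial limit. With $l = s$ and $l' = s'$, Eq. (3.2.3) reduces to $n'/s' - n/s = (n' - n)/R$; for finite $\phi$ the same condition can be solved for $s'$ as a quadratic in $u = s' - R$. The function names and numerical values are illustrative assumptions, and the solver assumes $R > 0$, $s < 0$, and a real image.

```python
import math


def paraxial_image_distance(s, R, n, n_prime):
    """Image distance from n'/s' - n/s = (n' - n)/R."""
    return n_prime / (n / s + (n_prime - n) / R)


def image_distance(s, R, n, n_prime, phi):
    """Solve Eq. (3.2.3) for s' at a given angle phi (radians).

    Assumes R > 0, s < 0 and a real image (s' > R).
    """
    l_abs = math.sqrt(R**2 + (R - s)**2 - 2.0 * R * (R - s) * math.cos(phi))
    # With l = -l_abs, Eq. (3.2.3) reads n (R - s)/l_abs = n' (s' - R)/l'.
    A = n * (R - s) / l_abs
    # Squaring with u = s' - R gives
    # (n'^2 - A^2) u^2 - 2 A^2 R cos(phi) u - A^2 R^2 = 0.
    qa = n_prime**2 - A**2
    qb = -2.0 * A**2 * R * math.cos(phi)
    qc = -(A * R)**2
    u = (-qb + math.sqrt(qb**2 - 4.0 * qa * qc)) / (2.0 * qa)
    return R + u


s, R, n, n_prime = -400.0, 100.0, 1.0, 1.5   # illustrative values, arbitrary units
s_paraxial = paraxial_image_distance(s, R, n, n_prime)
for phi in (1.0e-6, 0.05, 0.1, 0.2):
    print(phi, image_distance(s, R, n, n_prime, phi), s_paraxial)
```

For very small $\phi$ the solution coincides with the paraxial value, while for larger $\phi$ the ray crosses the axis closer to the surface, which is why the result of this section is strictly true only for paraxial rays.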


3.2.c. FOCAL LENGTH OF THIN LENS 


As an example of a slightly more complex system, we use Fermat’s Principle
to find the focal length of a thin lens of index $n$, with radii of curvature $R_1$ and $R_2$,
as shown in Fig. 3.5.

To find the focal length we make use of the fact that Fermat’s Principle must 
apply to every ray between two conjugate points of a focusing system. For 
example, in Fig. 3.4 we see that a ray from B to B’ along the z-axis must have an 
OPL that is stationary with respect to closely adjacent paths. But each of these 
adjacent paths is itself stationary, hence the OPL is the same along all paths 
between two conjugate points, at least to a first approximation, provided the rays 
pass through the system. Stated differently, the OPL (or time of travel) between 
two conjugates of a perfect focusing system is neither a minimum nor a maximum. 

Returning to the thin lens shown in Fig. 3.5, we find the OPL for each of two
rays. For the ray coincident with the z-axis we get

$$L_0 = [BO_1] + n[O_1O_2] + f',$$

while for the ray at height $y$ in the paraxial range

$$L_p = [BO_1] + z_1 + n[P_1P_2] - z_2 + l,$$

where $z_2 < 0$ and $l$ is measured from the $y_2$ axis. Although the distance $[BO_1]$ is
infinite, this is of no consequence because on setting $L_0$ equal to $L_p$ this distance
drops out, and we get

$$nd + f' = z_1 + n(d - z_1 + z_2) - z_2 + l. \tag{3.2.4}$$

Fig. 3.5. Cross-section of thin lens (not to scale). By sign convention $z_1 > 0$, $z_2 < 0$.

In Eq. (3.2.4) we have substituted $d = [O_1O_2]$ and $d - z_1 + z_2 = [P_1P_2]$.
Rearranging Eq. (3.2.4) leads to

$$l - f' = (n - 1)(z_1 - z_2). \tag{3.2.5}$$

The radii of curvature, $R_1$ and $R_2$, are given by

$$R_1^2 = y_1^2 + (R_1 - z_1)^2 \approx R_1^2 + y_1^2 - 2R_1z_1,$$

$$R_2^2 = y_2^2 + (-R_2 + z_2)^2 \approx R_2^2 + y_2^2 - 2R_2z_2,$$

where the terms in $z_1^2$ and $z_2^2$ are neglected and $y_1 = y_2 = y$ for a thin lens in
the paraxial range. In this approximation we get $z_1 = y^2/2R_1$ and $z_2 = y^2/2R_2$.

From Fig. 3.5 we see that $l^2 = y^2 + f'^2 = f'^2(1 + y^2/f'^2)$. Taking the square
root and using the binomial expansion gives $l - f' = y^2/2f'$. Taking these
results, substituting for $z_1$, $z_2$, and $l - f'$ in Eq. (3.2.5), and canceling common
factors gives

$$\frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right),$$

a result already seen in Eq. (2.4.4). A similar approach can be used to find $s$ and
$s'$ in terms of $f'$, an exercise left to the reader.
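
The equal-OPL condition of Eq. (3.2.5) can be checked numerically. The sketch below evaluates the exact sags of two spherical surfaces and the exact slant distance $l$, and reports the residual $(l - f') - (n - 1)(z_1 - z_2)$ when $f'$ is taken from the thin-lens result. The lens parameters and function names are illustrative assumptions.

```python
import math


def thin_lens_focal_length(n, R1, R2):
    """1/f' = (n - 1)(1/R1 - 1/R2)."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))


def spherical_sag(y, R):
    """Exact sag of a sphere of radius R at height y (same sign as R)."""
    return R - math.copysign(math.sqrt(R * R - y * y), R)


def opl_residual(y, n, R1, R2):
    """(l - f') - (n - 1)(z1 - z2); zero if Eq. (3.2.5) holds exactly."""
    f_prime = thin_lens_focal_length(n, R1, R2)
    z1 = spherical_sag(y, R1)          # z1 > 0 for R1 > 0
    z2 = spherical_sag(y, R2)          # z2 < 0 for R2 < 0
    l = math.hypot(y, f_prime)
    return (l - f_prime) - (n - 1.0) * (z1 - z2)


n, R1, R2 = 1.5, 100.0, -100.0         # hypothetical biconvex lens
print(thin_lens_focal_length(n, R1, R2))
for y in (1.0, 5.0, 10.0):
    print(y, opl_residual(y, n, R1, R2))
```

The residual is negligible near the axis and grows with ray height, consistent with the statement that the derived focal length applies in the paraxial range.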


3.2.d. DISPERSING PRISM 


As our final example in this section we consider a glass prism as shown in Fig.
3.6. Because $n = n(\lambda)$ the angle of deviation $\theta$ is also a function of $\lambda$, where $\lambda$ is
the wavelength of light. With rays incident as shown in Fig. 3.6, there is some
wavelength whose rays in the prism follow paths parallel to the prism base. For
these rays the diagram is symmetric about the vertical bisector of the prism, and
hence $L_1 = L_2 = L$, $\phi_1 = \phi_2 = \phi$, and $a_1 = a_2 = a$.

Applying Fermat’s Principle to this symmetric situation we get

$$2L\cos\phi = nt, \tag{3.2.6}$$

where the left side of Eq. (3.2.6) is the OPL of the upper ray and the right side is
the OPL of the lower ray in Fig. 3.6. We are interested in seeing how $\theta$ changes
with wavelength. Differentiating Eq. (3.2.6) with respect to wavelength gives

$$t\frac{dn}{d\lambda} = -2L\sin\phi\,\frac{d\phi}{d\lambda} = -2L\sin\phi\,\frac{d\phi}{d\theta}\frac{d\theta}{d\lambda}. \tag{3.2.7}$$

Fig. 3.6. Dispersing prism of base $t$ and opposite angle $\gamma$.

From Fig. 3.6 we see that $L\sin\phi = a$ and $\theta = \pi - \gamma - 2\phi$, from which we get
$d\phi/d\theta = -1/2$. Substituting into Eq. (3.2.7) we get

$$\frac{d\theta}{d\lambda} = \left(\frac{t}{a}\right)\frac{dn}{d\lambda}, \tag{3.2.8}$$

where $t/a$ is the ratio of the base length to the beam width.

The index of refraction of most optical glasses can be expressed approximately
in the form

$$n(\lambda) = A + (B/\lambda^2), \tag{3.2.9}$$

where $A$ and $B$ are constants. Differentiating Eq. (3.2.9) and combining with Eq.
(3.2.8) we get

$$\frac{d\theta}{d\lambda} = -\left(\frac{t}{a}\right)\left(\frac{2B}{\lambda^3}\right). \tag{3.2.10}$$

The negative sign indicates that $\theta$ decreases as $\lambda$ increases, hence blue light is
deviated more than red light. We also note that $d\theta/d\lambda$, angular dispersion, is
numerically larger for shorter wavelengths.
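
As a check on the dispersion relation, the sketch below traces a ray through a prism with Snell’s law at each face and differentiates the exact deviation numerically. It assumes an isosceles prism with its faces fully illuminated, so that $t = 2L\sin(\gamma/2)$ and, from $\theta = \pi - \gamma - 2\phi$ at symmetric passage, $\sin\phi = \cos i_1$, where $i_1$ is the angle of incidence. The glass constants $A$ and $B$, the apex angle, and the function names are hypothetical values chosen for illustration.

```python
import math


def cauchy_index(lam_nm, A, B):
    """Index n = A + B/lambda^2, wavelength in nm and B in nm^2."""
    return A + B / lam_nm**2


def deviation(lam_nm, gamma, i1, A, B):
    """Exact deviation of a ray through a prism of apex angle gamma."""
    n = cauchy_index(lam_nm, A, B)
    r1 = math.asin(math.sin(i1) / n)     # refraction at the first face
    r2 = gamma - r1                      # internal angle at the second face
    i2 = math.asin(n * math.sin(r2))     # refraction at the second face
    return i1 + i2 - gamma


A, B = 1.5, 5.0e3                        # hypothetical glass constants
gamma = math.radians(60.0)
lam0 = 500.0                             # nm, wavelength of symmetric passage
n0 = cauchy_index(lam0, A, B)
i1 = math.asin(n0 * math.sin(gamma / 2.0))

# Ratio of base length to beam width for the symmetric ray.
t_over_a = 2.0 * math.sin(gamma / 2.0) / math.cos(i1)
analytic = -t_over_a * 2.0 * B / lam0**3   # rad per nm, from (t/a) dn/dlambda

h = 0.01                                 # nm, step for the central difference
numeric = (deviation(lam0 + h, gamma, i1, A, B)
           - deviation(lam0 - h, gamma, i1, A, B)) / (2.0 * h)
print(analytic, numeric)
```

The two values should agree closely, and both are negative, in line with the statement that the deviation decreases with increasing wavelength.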
3.3. WAVE INTERPRETATION OF FERMAT’S PRINCIPLE


Fermat’s Principle is a statement about the behavior of light rays in terms of
optical path length. The statement does not in any way make use of the fact that
light is an electromagnetic wave capable of undergoing constructive and
destructive interference. By treating light as a wave we can give a physical
interpretation of Fermat’s Principle in terms of destructive interference of waves
following different paths. This is most easily done by means of a specific
example.

Fig. 3.7. (a) Optical path difference for ray in Fig. 3.3. See text, Section 3.3, for defined
coordinates. (b) Greatly magnified view near minimum in (a).

If, in Fig. 3.3, we choose $n = 1$, $n' = \sqrt{5/2}$, then the stationary path is that
for which $P_2$ is at $(2, 2)$ when $P_1$ is at $(0, -1)$ and $P_0$ is at $(1, 0)$, with each point
given as $(y, z)$. The optical path length between $P_1$ and $P_2$ is then a minimum.
In order for another wave originating at $P_1$ to reach $P_2$ half a cycle after the wave
following the minimum path, we need $\Delta y_0$ of 1475 wavelengths for light of 500 nm,
when the coordinates of the points are given in meters. One-half cycle difference
for two waves corresponds to destructive interference.

Stated in another way, the extra path length introduced when $y_0$ is changed by
1475 wavelengths (or 0.7375 mm) is only a half of a wavelength when the change
is in the neighborhood of the stationary path. If, on the other hand, we choose $P_0$
at $(1.2, 0)$, then the path from $P_1$ to $P_2$ is not a stationary one. In this case a half
wavelength change in OPL is introduced when $\Delta y_0$ is about 2.8 wavelengths. The
variation in OPL as a function of $y_0$ is shown in Fig. 3.7a,b.
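
The numbers quoted in this example can be reproduced with a few lines of code. The sketch below evaluates the OPL through the interface point $(y_0, 0)$ and finds, by bisection, the change in $y_0$ that lengthens the path by half a wavelength. Points are taken as $(y, z)$ pairs in meters; the function names and the bisection bracket are our own choices.

```python
import math

WAVELENGTH = 500.0e-9                    # m
N, N_PRIME = 1.0, math.sqrt(5.0 / 2.0)
P1 = (0.0, -1.0)                         # (y, z) in m
P2 = (2.0, 2.0)


def opl(y0):
    """OPL from P1 to P2 via the point (y0, 0) on the interface."""
    return (N * math.hypot(y0 - P1[0], P1[1])
            + N_PRIME * math.hypot(P2[0] - y0, P2[1]))


def shift_for_half_wave(y0, upper=0.1):
    """Smallest increase in y0 that lengthens the OPL by half a wavelength."""
    base = opl(y0)
    lo, hi = 0.0, upper
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if opl(y0 + mid) - base < 0.5 * WAVELENGTH:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


near_stationary = shift_for_half_wave(1.0) / WAVELENGTH   # in wavelengths
off_stationary = shift_for_half_wave(1.2) / WAVELENGTH
print(near_stationary, off_stationary)
```

Near the stationary path the OPL varies quadratically with $\Delta y_0$, so a displacement of well over a thousand wavelengths is needed; away from it the variation is linear and a few wavelengths suffice.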

Fermat’s Principle can therefore be thought of as giving the path through 
which the highest transmission of light is possible. This path is the one that 
presents to the light waves the largest area without significant destructive 
interference for waves that pass through that area. 


3.4. FERMAT’S PRINCIPLE AND REFLECTING SURFACES 


The application of Fermat’s Principle to a spherical refracting interface and 
thin lens in Section 3.2 gives results that apply in the paraxial domain. In these 
examples the surface shapes were specified (all spherical), with the result that the 
derived equations are strictly true only for paraxial rays. In this section on 
reflecting surfaces we adopt a different procedure and require that rays over the 
entire aperture satisfy Fermat’s Principle. We then find the appropriate surface 
shape needed to satisfy this requirement. 


3.4.a. CONCAVE MIRROR, ONE CONJUGATE AT INFINITY


We first consider the concave mirror shown in Fig. 3.8. Parallel rays are
incident from the left with all rays focused at a distance $f$ from the mirror vertex.
For convenience we let $f$, $l$, and $\Delta$ be positive quantities. Applying Fermat’s
Principle to a ray on the optical axis and a ray at height $y$, we see that equal OPLs
require $2f = l + (f - \Delta)$, or $l = f + \Delta$.

From the geometry in Fig. 3.8 we see that

$$l^2 = y^2 + (f - \Delta)^2. \tag{3.4.1}$$

Fig. 3.8. Rays from distant point source incident on concave reflector, where $l$ is the distance from
$P_0$ to $B'$. Image at $B'$ is point for surface given by Eq. (3.4.3).

Eliminating $l$ in Eq. (3.4.1) gives $y^2 = 4f\Delta$, which in terms of $z = -\Delta$ is

$$y^2 = -4fz. \tag{3.4.2}$$

Equation (3.4.2) is the equation of a parabola whose vertex is at (0, 0). The
paraboloid, or paraboloidal surface of revolution, is obtained by rotating the
parabola about the z-axis; its equation is found by replacing $y^2$ by $x^2 + y^2$. Using
Eq. (2.3.2) we can express $f$ in terms of $R$ which, upon applying the sign
convention to $R$, gives

$$y^2 = 2Rz. \tag{3.4.3}$$

$R$ is the radius of curvature at the mirror vertex, and both $R$ and $z$ are negative in
Fig. 3.8.
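
The equal-OPL requirement $2f = l + (f - \Delta)$ is easy to test numerically. The sketch below compares a paraboloid, for which $\Delta = y^2/4f$, with a sphere of the same vertex radius $2f$. The focal length, ray heights, and function names are illustrative assumptions.

```python
import math


def opl_via_mirror(y, f, depth):
    """OPL from the plane through the focus to the mirror and on to the focus.

    depth(y, f) is the positive distance Delta of the surface point from the
    plane tangent to the mirror vertex.
    """
    delta = depth(y, f)
    l = math.hypot(y, f - delta)
    return (f - delta) + l


def paraboloid_depth(y, f):
    return y * y / (4.0 * f)             # from y^2 = 4 f Delta


def sphere_depth(y, f):
    radius = 2.0 * f                     # same vertex radius of curvature
    return radius - math.sqrt(radius * radius - y * y)


f = 1.0                                  # illustrative focal length
for y in (0.1, 0.3, 0.5):
    print(y,
          opl_via_mirror(y, f, paraboloid_depth) - 2.0 * f,
          opl_via_mirror(y, f, sphere_depth) - 2.0 * f)
```

For the paraboloid the difference from $2f$ vanishes at every height, while for the sphere it grows with $y$, so only the paraboloid satisfies Fermat’s Principle over the entire aperture for an object at infinity.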


3.4.b. CONCAVE MIRROR, BOTH CONJUGATES FINITE 


Figure 3.9 shows a concave mirror with an object point at $B$ and the
corresponding image point at $B'$, both on the z-axis. Here we adopt the sign
convention for $s$ and $s'$ at the outset, while choosing $l$, $l'$, and $\Delta$ as positive
quantities. Given $s$ and $s' < 0$ in Fig. 3.9, the application of Fermat’s Principle to
the two rays leaving $B$ gives

$$l + l' = -(s + s'),$$

$$l^2 = y^2 + (-s - \Delta)^2, \qquad l'^2 = y^2 + (-s' - \Delta)^2.$$

Fig. 3.9. Rays between conjugates at finite distances via concave reflector, where $l$ ($l'$) is the
distance from $P_0$ to $B$ ($B'$). Imagery is perfect for surface given by Eq. (3.4.4).

Eliminating $l$ and $l'$ between these relations, and letting $\Delta = -z$ as in Eq. (3.4.2),
leads to the relation

$$y^2 - 4z\frac{ss'}{s + s'} + 4z^2\frac{ss'}{(s + s')^2} = 0. \tag{3.4.4}$$

This is the equation for an ellipse with center $(0, a)$, with $a$ and $b$ the semimajor
and semiminor axes, respectively. We can easily put Eq. (3.4.4) into the standard
form of an ellipse equation if we choose $2a = s + s'$, $b^2 = ss'$. The standard
equation for an ellipse with center $(0, a)$ is

$$\frac{(z - a)^2}{a^2} + \frac{y^2}{b^2} = 1,$$

which can be written as

$$y^2 - 2z\frac{b^2}{a} + z^2\frac{b^2}{a^2} = 0. \tag{3.4.5}$$

The choice of $a$ and $b$ as given in the preceding follows directly from a
comparison of Eqs. (3.4.4) and (3.4.5). It is not surprising that Fermat’s Principle
leads to an ellipse as the appropriate curve with the two conjugate points at the
foci of the ellipse, considering the standard technique for drawing an ellipse with
pencil, string, and two pins. A rotation of the ellipse about the z-axis gives an
ellipsoid, with the surface equation given by Eq. (3.4.5) after replacing $y^2$ by
$x^2 + y^2$.

Note that the sphere is a special case of an ellipsoid in which $s = s'$ and $a = b$.
Note also that the parabola given by Eq. (3.4.2) is a special case of Eq. (3.4.4) in
which $s = \infty$ and $s' = -f$.


3.4.c. CONVEX MIRROR, BOTH CONJUGATES FINITE


Figure 3.10 shows a convex mirror with a virtual object point at $B$ and the
conjugate image point at $B'$, both on the z-axis. As with the ellipse, we adopt the
sign convention for $s$ and $s'$ but choose $l$, $l'$, and $\Delta$ as positive quantities. The
dashed arc in Fig. 3.10 is a circular arc whose center is at $B$. Applying Fermat’s
Principle to the two rays heading toward $B$ gives

$$d + s' = l',$$

while the geometry of Fig. 3.10 gives

$$l^2 = y^2 + (-s - \Delta)^2, \qquad l = d - s, \qquad l'^2 = y^2 + (s' + \Delta)^2.$$

Eliminating $l$, $l'$, and $d$ between these relations, and putting $\Delta = -z$, leads to

$$y^2 - 4z\frac{ss'}{s + s'} + 4z^2\frac{ss'}{(s + s')^2} = 0, \tag{3.4.6}$$

an equation identical to Eq. (3.4.4). There is, however, an important difference
between Eq. (3.4.4) and Eq. (3.4.6). In the former equation $s$ and $s'$ have the same
sign because both conjugates are on the same side of the mirror vertex; in the
latter equation $s$ and $s'$ have opposite signs. As is easily demonstrated, Eq. (3.4.6)
is the equation of a hyperbola.

The standard equation for a hyperbola with a vertex at (0, 0) is

$$\frac{(z - a)^2}{a^2} - \frac{y^2}{b^2} = 1,$$

which can be rewritten as

$$y^2 + 2z\frac{b^2}{a} - z^2\frac{b^2}{a^2} = 0. \tag{3.4.7}$$

Fig. 3.10. Rays between conjugates at finite distances via convex reflector. Imagery is perfect for
surface given by Eq. (3.4.6).

Equations (3.4.6) and (3.4.7) agree if we choose $b^2 = -ss'$ and $2a = s + s'$. As
before, the replacement of $y^2$ by $x^2 + y^2$ gives a hyperboloid of revolution about
the z-axis.

The case of a convex mirror with one conjugate at infinity is left as an exercise 
for the reader. The appropriate surface for this situation is a paraboloid. 
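
The results of Sections 3.4.b and 3.4.c can be verified numerically. The sketch below solves the surface equation $y^2 - 4z\,ss'/(s + s') + 4z^2\,ss'/(s + s')^2 = 0$ for the root passing through the vertex and then evaluates the distances from the surface point to the two conjugates. For the ellipse the sum $l + l'$ should equal $-(s + s')$; for the hyperbola the difference $l' - l$ should equal $s + s'$. The conjugate distances and function names are illustrative assumptions.

```python
import math


def conic_z(y, s, s_prime):
    """Root of the surface equation through the vertex (z -> 0 as y -> 0)."""
    p = s * s_prime / (s + s_prime)
    q = s * s_prime / (s + s_prime)**2
    # 4 q z^2 - 4 p z + y^2 = 0, written to avoid cancellation near the vertex.
    root = math.copysign(math.sqrt(p * p - q * y * y), p)
    return y * y / (2.0 * (p + root))


def path_lengths(y, s, s_prime):
    """Distances from the surface point at height y to B and to B'."""
    z = conic_z(y, s, s_prime)
    return math.hypot(y, z - s), math.hypot(y, z - s_prime)


# Ellipse: both conjugates on the same side of the vertex (s, s' < 0).
for y in (0.1, 0.3, 0.5):
    l, l_prime = path_lengths(y, -3.0, -1.5)
    print("ellipse", y, l + l_prime)        # compare with -(s + s') = 4.5

# Hyperbola: conjugates on opposite sides of the vertex (s < 0 < s').
for y in (0.1, 0.3, 0.5):
    l, l_prime = path_lengths(y, -1.0, 4.0)
    print("hyperbola", y, l_prime - l)      # compare with s + s' = 3.0
```

The same function serves both cases because the two surface equations are identical in form; only the relative signs of $s$ and $s'$ differ.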


3.5. CONIC SECTIONS 


Each of the surface cross sections derived in the preceding section is a conic
section and it is therefore appropriate to find a single equation describing the
family of such curves with the vertex at the origin. We proceed by working with
Eq. (3.4.4) for an ellipse. From Eq. (2.3.1) we get

$$\frac{ss'}{s + s'} = \frac{R}{2}, \tag{3.5.1}$$

where this relation applies in the paraxial region, hence $R$ is the vertex radius of
curvature. For an ellipse the eccentricity $e$ is defined as $e = c/a$, where $c$ is the
distance from one of the foci to the center of the ellipse and $c^2 = a^2 - b^2$.
Substituting in terms of $s$ and $s'$ we get

$$e^2 = 1 - \frac{4ss'}{(s + s')^2} = \frac{(s - s')^2}{(s + s')^2}. \tag{3.5.2}$$

Substituting Eqs. (3.5.1) and (3.5.2) into Eq. (3.4.4) gives

$$y^2 - 2Rz + (1 - e^2)z^2 = 0. \tag{3.5.3}$$

Although derived from the ellipse equation, the relation in Eq. (3.5.3) describes
the family of conic sections, provided we choose $e$ appropriately. In the literature
one often sees a conic section described in terms of a conic constant $K$, where
$K = -e^2$. In terms of both $e$ and $K$ the various conic sections are as follows:

    oblate ellipsoid:     $e^2 < 0$       $K > 0$
    sphere:               $e = 0$         $K = 0$
    prolate ellipsoid:    $0 < e < 1$     $-1 < K < 0$
    paraboloid:           $e = 1$         $K = -1$
    hyperboloid:          $e > 1$         $K < -1$

In all of the discussion to follow, we use $K$ to describe the conic sections.
Rewriting Eq. (3.5.2) in terms of the magnification $m$ by substituting Eq. (2.3.3)
into Eq. (3.5.2) gives

$$K = -\left(\frac{m + 1}{m - 1}\right)^2. \tag{3.5.4}$$

Transforming Eq. (3.5.3) to get the equation for the surface of revolution gives

$$r^2 - 2Rz + (1 + K)z^2 = 0, \tag{3.5.5}$$

where $r^2 = x^2 + y^2$.

At this point it is instructive to calculate $R_c$, the local radius of curvature at a
point $(r, z)$ on the mirror surface. The relation for radius of curvature is

$$R_c = (1 + z'^2)^{3/2}/z'',$$

where $z' = dz/dr$, $z'' = d^2z/dr^2$. Solving Eq. (3.5.5) for $z$ and carrying out the
calculation gives

$$R_c = R[1 - K(r^2/R^2)]^{3/2} = R[1 - K(\varepsilon^2/16F^2)]^{3/2}, \tag{3.5.6}$$

where $F = |f|/D$ and $r = \varepsilon D/2$, with $0 \le \varepsilon \le 1$.

For $K = 0$ we get $R_c = R$, as expected. As we go through the family of conic
surfaces from sphere to ellipsoid to paraboloid to hyperboloid, we see that $R_c$ gets
progressively larger for a given $r$ and $R$. Alternatively the local curvature, $1/R_c$,
gets progressively smaller. As the point on the surface approaches the vertex,
hence $r \to 0$, we see that $R_c \to R$. Near the vertex all of the surfaces have nearly
the same shape and, in the paraxial approximation, are identical. We will return to
a further discussion of Eq. (3.5.6) and its ramifications in the fabrication of large
mirrors in Chapter 18.
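
The closed form for the local radius of curvature, $R_c = R[1 - K(r^2/R^2)]^{3/2}$, can be compared with a direct finite-difference evaluation of $(1 + z'^2)^{3/2}/z''$ on the conic surface $r^2 - 2Rz + (1 + K)z^2 = 0$. In the sketch below the vertex radius, ray height, step size, and function names are illustrative assumptions.

```python
import math


def conic_sag(r, R, K):
    """Solution of r^2 - 2 R z + (1 + K) z^2 = 0 that vanishes at the vertex."""
    root = math.copysign(math.sqrt(R * R - (1.0 + K) * r * r), R)
    return r * r / (R + root)


def local_radius(r, R, K):
    """Closed form R [1 - K r^2 / R^2]^(3/2)."""
    return R * (1.0 - K * r * r / (R * R))**1.5


def local_radius_numeric(r, R, K, h=1.0e-3):
    """(1 + z'^2)^(3/2) / z'' from central differences."""
    z_minus = conic_sag(r - h, R, K)
    z_0 = conic_sag(r, R, K)
    z_plus = conic_sag(r + h, R, K)
    z1 = (z_plus - z_minus) / (2.0 * h)
    z2 = (z_plus - 2.0 * z_0 + z_minus) / (h * h)
    return (1.0 + z1 * z1)**1.5 / z2


R, r = 10.0, 1.0                          # illustrative values
for K in (0.0, -0.5, -1.0, -2.0):         # sphere to hyperboloid
    print(K, local_radius(r, R, K), local_radius_numeric(r, R, K))
```

The printed values increase as $K$ becomes more negative, which is the progression from sphere to hyperboloid described above.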

In summary, then, we see that conic surfaces used as mirrors provide perfect 
imagery for a single pair of conjugates. A given conic mirror, however, will not 
strictly satisfy Fermat’s Principle at any other pair of conjugates. As we will see, 
this failure to image a point into a point implies the presence of aberrations, a 
subject we explore in detail in subsequent chapters. In spite of this apparent 
limitation, the family of conic surfaces is the basis for most multi-mirror systems. 


3.6. FERMAT’S PRINCIPLE AND THE ATMOSPHERE 


In this section we consider some of the effects of the Earth’s atmosphere:
refraction and its variation with zenith angle and wavelength, and the effect of
time-varying index changes on the path of a light ray. Our discussion is only an
introduction to further illustrate the utility of Fermat’s Principle; more specifics on
each of these topics will follow in subsequent chapters.


3.6.a. ATMOSPHERIC REFRACTION


Assume that the atmosphere is a flat, layered medium with the index $n = n(z)$
only, hence the curvature of the atmosphere is neglected. In this case Eq. (3.1.10)
becomes

$$n\kappa = n\cos\alpha\,\frac{d\alpha}{dz} = -\sin\alpha\,\frac{dn}{dz}, \tag{3.6.1}$$

where the z-axis points toward the center of the Earth. The change in the index of
the atmosphere from the top ($n = 1$) to the surface ($n = 1.00029$) is small, hence
the path of a ray from a star is not deviated appreciably for $\alpha$ not close to 90°.
Integrating Eq. (3.6.1) with the assumption that $\alpha$ is nearly constant, hence $\cos\alpha$
and $\sin\alpha$ brought out from the integral, gives

$$\delta\alpha = -\tan\alpha_0\,\delta n = -(n - 1)\tan\alpha_0, \tag{3.6.2}$$

where $\alpha_0$ is the angle of incidence at the top of the atmosphere, or zenith angle,
and $\delta n$ is the change in index.

For a ray passing downward through the atmosphere $\delta n = (n - 1) > 0$, and
hence $\delta\alpha < 0$. Thus the angle the ray makes with the z-axis decreases as the ray
proceeds down through the atmosphere, that is, the ray is bent “toward” the
z-axis. That the effect is small is seen by taking, for example, $\alpha_0 = 45°$ and finding
a ray deviation of magnitude $|\delta\alpha| = 0.00029$ radians, or about 1 arc-min.

The index of refraction of the atmosphere is a function of wavelength, as
shown by the entries in Table 3.1, hence the deviation $\delta\alpha$ is not the same for
different wavelengths. The parameter $R_0$ in Table 3.1 is the constant of refraction,
the index difference $\delta n$ expressed in units of arc-seconds.

The change $d(\delta\alpha)$ is the differential atmospheric refraction, with

$$d(\delta\alpha) = -\tan\alpha_0\,d(\delta n) = -(n_1 - n_2)\tan\alpha_0 \tag{3.6.3}$$

and $d(\delta n) = n_1 - n_2$, the change in index between two wavelengths $\lambda_1$ and $\lambda_2$.
From the values in Table 3.1 we see that the index changes more rapidly at shorter
wavelengths, hence differential refraction could adversely affect certain types of
observations in the near ultraviolet at large zenith angle. As an example using the
entries in Table 3.1, $d(\delta\alpha)$ in arc-seconds is about $1.38\tan\alpha_0$ over the range from
320 to 400 nm, and $2.48\tan\alpha_0$ over the range from 320 to 550 nm. With
$\tan\alpha_0 > 1$, for example, the visible image of a star centered on a small aperture
could result in no ultraviolet light passing through the aperture.

Table 3.1

Index of Refraction of Atmosphere*

    $\lambda$ (nm)    $n - 1$      $R_0$ (arc-sec)
     320       3.049E-4     62.86
     400       2.982E-4     61.48
     550       2.929E-4     60.38
     700       2.907E-4     59.93
    1000       2.890E-4     59.58

*Values of n from Allen (1973). Index given at T = 0°C, pressure = 760 mm
Hg, water vapor pressure = 4 mm Hg.
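
The differential refraction figures quoted in the text follow directly from the $R_0$ column of Table 3.1. The short sketch below reproduces them; the dictionary and function names are our own, and the flat-atmosphere approximation means the results should not be trusted near the horizon.

```python
import math

# Constant of refraction R0 in arc-seconds, keyed by wavelength in nm (Table 3.1).
R0_ARCSEC = {320: 62.86, 400: 61.48, 550: 60.38, 700: 59.93, 1000: 59.58}


def refraction_arcsec(wavelength_nm, zenith_deg):
    """Magnitude of the refraction, R0 * tan(zenith angle), in arc-seconds."""
    return R0_ARCSEC[wavelength_nm] * math.tan(math.radians(zenith_deg))


def differential_refraction_arcsec(lam1_nm, lam2_nm, zenith_deg):
    """Difference in refraction between two tabulated wavelengths."""
    return (refraction_arcsec(lam1_nm, zenith_deg)
            - refraction_arcsec(lam2_nm, zenith_deg))


print(differential_refraction_arcsec(320, 400, 45.0))   # about 1.38
print(differential_refraction_arcsec(320, 550, 45.0))   # about 2.48
print(math.degrees(0.00029) * 3600.0)                   # 0.00029 rad in arc-seconds
```

At a zenith angle of 60° the same differences grow by a factor of $\tan 60° \approx 1.73$, which illustrates why the effect matters most at large zenith angle.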


3.6.b. ATMOSPHERIC TURBULENCE 


The assumption that $n = n(z)$ neglects variations in index that are present in a
turbulent atmosphere at constant height due primarily to temperature fluctuations.
Consider a ray that enters the atmosphere from directly overhead, with the
deviation of the ray from a vertical path denoted by $\alpha$. Assuming $\alpha \ll 1$ we can
write Eq. (3.1.10) as

$$n(\partial\alpha/\partial z) = \partial n/\partial y, \tag{3.6.4}$$

where the term in $\sin\alpha$ is dropped because $\alpha$ is small. Letting $n = 1 + \delta n$, Eq.
(3.6.4) becomes (to first order)

$$\partial\alpha/\partial z = \partial(\delta n)/\partial y, \tag{3.6.5}$$

where $\delta n$ is the fluctuation in the index of refraction from the local mean. In the
general case there are corresponding equations in which $x$ replaces $y$. Integrating
Eq. (3.6.5) from the top of the atmosphere ($z = 0$) through a distance $s$ gives

$$\alpha(s) = \int_0^s \frac{\partial(\delta n)}{\partial y}\,dz. \tag{3.6.6}$$

The deviation given by Eq. (3.6.6) is, of course, a function of time, with random
variations in time for $\alpha_x$ and $\alpha_y$. Because $\langle\delta n\rangle$ is zero, where $\langle\ \rangle$ denotes an
average over time, the time-averages of the deviations are also zero. The
mean-square deviations, however, are not zero, and the net result is a ray that wanders
randomly about a mean position.
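
The statistical character of this result can be illustrated with a toy calculation. The sketch below replaces the integral for $\alpha(s)$ by a sum over independent layers, each contributing a random index gradient with zero mean. The Gaussian statistics, the layer thickness, the gradient rms, and the function names are all assumptions made purely for illustration; this is not a model of real atmospheric turbulence.

```python
import math
import random


def ray_deviation(rng, layers, dz, gradient_rms):
    """Discrete form of the integral: sum of d(delta n)/dy * dz over layers."""
    return sum(rng.gauss(0.0, gradient_rms) * dz for _ in range(layers))


rng = random.Random(1)                      # fixed seed for repeatability
layers, dz, gradient_rms = 100, 100.0, 1.0e-9   # hypothetical: 100 layers of 100 m
samples = [ray_deviation(rng, layers, dz, gradient_rms) for _ in range(2000)]

mean = sum(samples) / len(samples)
rms = math.sqrt(sum(a * a for a in samples) / len(samples))
expected_rms = gradient_rms * dz * math.sqrt(layers)   # independent layers add in quadrature
print(mean, rms, expected_rms)
```

The sample mean is small compared with the rms value, which mirrors the statement that the time-averaged deviation vanishes while the mean-square deviation does not.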

The net effect of these variations leads to the phenomenon called seeing. In a
small telescope the effect is seen as a star image in motion with excursions
typically of a few arc-seconds. In a large telescope the cumulative effect of seeing
is to give a blurred image with little or no motion of the image as a whole.

Although the approach using Fermat’s Principle shows the origin of seeing
effects, the statistical processes that lead to the effects described make it
impractical to proceed further with this approach. Selected results based on a
statistical approach to atmospheric turbulence are given in Chapter 16.


3.7. CONCLUDING REMARKS 


3.7.a. RAYS AND WAVEFRONTS 


The application of Fermat’s Principle to find conic surfaces that are perfect 
mirrors makes use of rays and optical path lengths. A different way of looking at 
what a focusing system does is in terms of wavefronts. A wavefront is simply a 
surface on which every point has the same optical path distance from a point 
source of light. In a homogeneous medium this surface is obviously a sphere 
whose center is the point object. In this same medium rays are radial lines 
directed outward, and at each point on a wavefront a ray is perpendicular to the 
wavefront. If a source is effectively at infinity, as for a star, then the resulting 
wavefront is plane. 

Examples of wavefronts are shown in Fig. 3.11a,b, with a vertical chief ray at 
the center of each wavefront in this representation. The wavefront designated 
random is a plane wave plus point-by-point variations using a random number 
generator. This wavefront might, at least approximately, represent a plane wave 
after passing through a slightly turbulent atmosphere. 

A perfect optical system that satisfies Fermat’s Principle is one that converts a 
spherical wavefront centered on a point object (or plane wavefront for a distant 
object) to a spherical wavefront centered on the conjugate point image. 

Conversely, if Fermat’s Principle is not satisfied for all rays from a point object 
over a large aperture, then the wavefront converging toward the image is no 
longer spherical and the image has aberrations. The connection between ray and 
wavefront aberrations is established in the discussion in Chapter 5. 


3.7.b. HOW PERFECT IS “PERFECT”?


Fermat’s Principle as used thus far is concerned only with rays and ignores the 
wave nature of light. Because of the wave character of light, no image is perfect in 
the sense that it is a point of infinitesimal size. The question to be addressed 
therefore, albeit not from a rigorous point of view here, is the minimum size of an 
image given an otherwise perfect optical system. 

Consider an optical system $L$ that is perfect according to Fermat’s Principle, as
shown schematically in Fig. 3.12. Light from two distant point sources, $A$ and $B$,
with angular separation $\theta$, fills the aperture of diameter $D$. According to wave
theory, two image points cannot be resolved or separated if the difference in
light-time travel of rays to them from opposite edges of the aperture is less than
approximately one period of the wave. Equivalently, the points cannot be resolved
if the optical path difference, or OPD, between these rays is less than
approximately one wavelength. In Fig. 3.12 the OPD between these rays is $\Delta$, with the
resolution limit set by $\Delta \approx \lambda$. From the geometry we see that

$$\theta_{\min} \approx \lambda/D. \tag{3.7.1}$$

From the angular resolution limit in Eq. (3.7.1) we can infer that the individual
images $A'$ and $B'$ must each have an angular diameter $\theta \approx \lambda/D$, as seen from $L$. If
the angular diameter of each image was substantially smaller than $\lambda/D$, then the
images would be resolved, contrary to the limit set by Eq. (3.7.1).

Fig. 3.11. Schematic diagrams of wavefronts: (a) spherical, (b) randomly distorted plane.

Fig. 3.12. Schematic of perfect optical system from which approximate diffraction limit is
derived. See Eq. (3.7.1) and Section 3.7.


The reasoning used to arrive at Eq. (3.7.1) and an estimate of the minimum 
possible image size is not a rigorous procedure, nor does it tell how the light is 
distributed within the image. A more rigorous approach requires analysis using 
diffraction theory, a topic we consider in some detail in Chapter 10. 
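
To give a sense of scale for Eq. (3.7.1), the sketch below converts $\lambda/D$ to arc-seconds for a few aperture diameters. The wavelength, the apertures, and the function name are illustrative choices, and the result is only the approximate limit discussed above, not the outcome of a diffraction calculation.

```python
import math

ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0


def diffraction_limit_arcsec(wavelength_m, aperture_m):
    """Approximate resolution limit lambda/D, in arc-seconds."""
    return wavelength_m / aperture_m * ARCSEC_PER_RADIAN


for aperture in (0.1, 1.0, 10.0):       # aperture diameters in meters
    print(aperture, diffraction_limit_arcsec(500.0e-9, aperture))
```

For visible light a 0.1-m aperture gives a limit of about one arc-second, and the limit scales inversely with the aperture diameter.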
Chapter 4 Introduction to Aberrations


Thus far our discussion of optical systems has proceeded along two different 
lines. In Chapter 2 we developed the paraxial equations for spherical refracting 
and reflecting surfaces, and noted that in the paraxial limit there is a one-to-one 
correspondence between object and image point. In Chapter 3 we turned our 
attention to Fermat’s Principle and reflecting surfaces of conic cross section. Our 
analysis led to the result that for a given pair of conjugate object and image points 
there is a conic surface that gives a perfect image, independent of the paraxial 
approximation. 

In this chapter we begin to examine what happens when Fermat’s Principle is 
not strictly satisfied in the range outside of the paraxial approximation. We will 
see that the geometrical image in this case is no longer a point but becomes a blur. 
An optical system that produces a blurred image, where the blur is in addition to 
the diffraction blur noted in Section 3.7, is a system with aberrations. 

To illustrate the onset of aberrations we consider a very simple optical system, 
a single conic mirror. After calculating the aberrations of several such mirrors for 
selected object points, we introduce the topic of aberration compensation. By this 
we mean that the aberrations of one optical element can be offset, wholly or 
partially, by those of another element. The two systems considered in this chapter 
are the Schmidt camera and the family of Cassegrain telescopes. This discussion 
is only an introduction; a more complete description of aberrations and
compensation follows in Chapter 5.


4.1. REFLECTING CONICS AND FOCAL LENGTH


We begin by calculating the focal length of a concave mirror or, more
specifically, the distance from the mirror vertex to the point where a reflected
ray from a distant object intersects the optical axis. Figure 4.1 shows a ray parallel
to the optical axis striking a mirror at height $r$, where $r$ is defined by Eq. (3.5.5).
Contrary to our normal convention, the light from the object proceeds from right
to left, a choice made for convenience. With this choice distances to the right of
the mirror vertex measured along the z-axis are positive. We also take the angle $\phi$
in Fig. 4.1 positive when $r > 0$.

From the geometry of Fig. 4.1 we see that $f = z + z_0$, where

$$z_0 = \frac{r}{\tan 2\phi} = \frac{r(1 - \tan^2\phi)}{2\tan\phi}. \tag{4.1.1}$$

From Fig. 4.1 we also note that $\tan\phi$ is simply $dz/dr$, the negative of the slope of
the normal to the mirror. From Eq. (3.5.5) we find the relation

$$\frac{dz}{dr} = \frac{r}{R - (1 + K)z} = \tan\phi.$$

Substituting this into Eq. (4.1.1) gives

$$z_0 = \frac{r}{2}\left[\frac{R - (1 + K)z}{r} - \frac{r}{R - (1 + K)z}\right]. \tag{4.1.2}$$

Putting Eq. (4.1.2) into $f = z + z_0$ gives

$$f = \frac{R}{2} + \frac{(1 - K)z}{2} - \frac{r^2}{2[R - (1 + K)z]}. \tag{4.1.3}$$








Fig. 4.1. Geometry of ray from distant object reflected from concave mirror.

Because the surface equation (3.5.5) is quadratic in z, the solution for z contains a
square root, as will Eq. (4.1.3) when z is eliminated. We proceed, therefore, by
expanding the square root as a power series in small quantities before substituting
into Eq. (4.1.3). Solving Eq. (3.5.5) for z gives

$$ z = \frac{R}{1+K}\left[1 - \left(1 - (1+K)\frac{r^2}{R^2}\right)^{1/2}\right]
     = \frac{r^2}{2R} + (1+K)\frac{r^4}{8R^3} + (1+K)^2\frac{r^6}{16R^5} + \cdots. \tag{4.1.4} $$

Substituting Eq. (4.1.4) into Eq. (4.1.3) gives

$$ f = \frac{R}{2} - \frac{(1+K)r^2}{4R} - \frac{(1+K)(3+K)r^4}{16R^3} - \cdots. \tag{4.1.5} $$


Examination of Eq. (4.1.5) shows that f = R/2 for K = —1, a paraboloid. 
Although higher power terms are not included in Eq. (4.1.5), this statement 
about a paraboloid is true when all terms are included. This is easily verified by 
setting K = —1 prior to making the foregoing substitutions. 

For a sphere or ellipsoid the conic constant K > —1 and f < R/2, while for a 
hyperboloid f > R/2. As expected, Fermat’s Principle is strictly satisfied, hence f 
is constant for any r, only for a paraboloid when the object is at infinity. For any 
other conic the change in focal length Δf as a function of r is

$$ \Delta f = f(r) - f(\text{paraxial}) = -\frac{(1+K)r^2}{4R} - \frac{(1+K)(3+K)r^4}{16R^3} - \cdots. \tag{4.1.6} $$

Thus for any conic surface other than a paraboloid the image of a distant object
on the optical axis is blurred. Examination of Eq. (4.1.6) shows that Δf is
independent of the sign of r, hence the blur is symmetric about the z-axis. Note
also that a change in the sign of R changes the sign of Δf, as it should.
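As a numerical check on the series, the following minimal sketch compares the axis crossing distance computed from the closed form $f = R/2 + (1-K)z/2 - r^2/2[R-(1+K)z]$ with the two-term expansion $f \approx R/2 - (1+K)r^2/4R - (1+K)(3+K)r^4/16R^3$. It assumes that the surface equation (3.5.5) has the form $r^2 - 2Rz + (1+K)z^2 = 0$; the function names are illustrative only.

```python
import math

def sag(r, R, K):
    """Exact sag z(r) of a conic, assuming r^2 - 2Rz + (1+K)z^2 = 0."""
    if abs(1.0 + K) < 1e-12:            # paraboloid: equation is linear in z
        return r * r / (2.0 * R)
    return R / (1.0 + K) * (1.0 - math.sqrt(1.0 - (1.0 + K) * r * r / R**2))

def focal_exact(r, R, K):
    """Distance from vertex to axis crossing, closed form of Eq. (4.1.3)."""
    z = sag(r, R, K)
    return R / 2.0 + (1.0 - K) * z / 2.0 - r * r / (2.0 * (R - (1.0 + K) * z))

def focal_series(r, R, K):
    """Power-series form of Eq. (4.1.5), truncated after the r^4 term."""
    return (R / 2.0
            - (1.0 + K) * r**2 / (4.0 * R)
            - (1.0 + K) * (3.0 + K) * r**4 / (16.0 * R**3))

if __name__ == "__main__":
    R = 1.0
    for K in (-1.0, 0.0, -2.0):         # paraboloid, sphere, hyperboloid
        for r in (0.05, 0.10):
            print(K, r, focal_exact(r, R, K), focal_series(r, R, K))
```

For the paraboloid the exact value stays at R/2 for every r, while for the sphere and hyperboloid it falls below and above R/2, respectively, in line with the discussion above.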


4.2. SPHERICAL ABERRATION 


We now examine in detail the nature of the aberration for the case of an object 
on the optical axis, an aberration called spherical aberration. Of particular 
interest is the size of the blur, measured perpendicular to the optical axis, at or 
near the paraxial focus. 


4.2.a. TRANSVERSE AND LONGITUDINAL


We define the transverse spherical aberration (TSA) as the height at which a
ray from height r on the mirror intersects the paraxial focal plane, as shown in
Fig. 4.2. It is conventional to define the longitudinal spherical aberration (LSA) as
the distance from the paraxial focal plane to the point where a ray from height r
crosses the z-axis. In Fig. 4.2 we see that LSA is simply Δf. From similar
triangles there we find

$$ \mathrm{TSA}/\mathrm{LSA} = r/(f - z), $$

where both TSA and LSA are negative in Fig. 4.2. Using Eqs. (4.1.4)–(4.1.6),
applying the binomial expansion, and retaining all terms through fifth order, gives

$$ \mathrm{TSA} = -(1+K)\frac{r^3}{2R^2} - 3(1+K)(3+K)\frac{r^5}{8R^4} + \cdots. \tag{4.2.1} $$


Each term is designated according to the power of r. The first term is the third- 
order transverse spherical aberration (TSA3); the second term is fifth-order 
transverse spherical aberration (TSA5). For K = 0, a spherical surface, each 
term in Eq. (4.2.1) is negative for r > 0, and positive for r < 0. The sign of TSA 
indicates where a given ray crosses the paraxial focal plane, in accord with the 
sign convention established for distances measured perpendicular to the z-axis. 
Because of the presence of the factor (1 + K) in Eq. (4.2.1), the sign of TSA for a 
hyperboloid is opposite that for a sphere or ellipsoid. Note also that the sign of 
TSA is independent of the sign of R. 

Fig. 4.2. Transverse spherical aberration (TSA) at paraxial focus. See Eqs. (4.1.6) and (4.2.1).

The relative size of the two terms in Eq. (4.2.1) for rays from the edge of the
aperture is given by

$$ \frac{\mathrm{TSA5}}{\mathrm{TSA3}} = \frac{3(3+K)r^2}{4R^2} = \frac{3(3+K)}{64F^2}, $$

where F is the focal ratio. For a sphere TSA5 is 10% of TSA3 when F = 1.19.
Thus the TSA5 term can be neglected for all but very fast mirrors, that is,
those with small focal ratios.
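The 10% figure quoted for a sphere follows directly from the ratio TSA5/TSA3 = 3(3 + K)/64F². A short sketch, with hypothetical function names, that evaluates the ratio and inverts it for the focal ratio:

```python
import math

def tsa5_over_tsa3(F, K=0.0):
    """Edge-ray ratio TSA5/TSA3 = 3(3 + K) / (64 F^2) for focal ratio F."""
    return 3.0 * (3.0 + K) / (64.0 * F * F)

def focal_ratio_for_fraction(fraction, K=0.0):
    """Focal ratio at which TSA5 is the given fraction of TSA3."""
    return math.sqrt(3.0 * (3.0 + K) / (64.0 * fraction))

if __name__ == "__main__":
    print(focal_ratio_for_fraction(0.10))   # sphere, TSA5 = 10% of TSA3
    print(tsa5_over_tsa3(2.0))              # sphere at f/2
```

At f/2 the fifth-order term of a sphere is already below 4% of the third-order term, which is why the third-order result is usually adequate.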

We can also find the spherical aberration for the case shown in Fig. 4.2 by
working directly with surface equations, one for a paraboloid and one for a
surface with conic constant K. From Eq. (4.1.4) we find the difference between
the surfaces, through terms in $r^4$, given by

$$ \Delta z = z_p - z(K) = -(1+K)\frac{r^4}{8R^3}, \tag{4.2.2} $$

where the subscript p denotes the paraboloid. From Fig. 4.3 we see that the path
difference between two rays, one incident on the paraboloid and one on the other
surface at the same height, is approximately 2Δz, provided the angles φ and $\phi_p$
are small.

We also see from Fig. 4.3 that the directions of the reflected rays differ by
$2(\phi_p - \phi)$, where φ = dz/dr and $\phi_p = dz_p/dr$, in the paraxial approximation.
From Eq. (4.1.4) we find

$$ 2(\phi_p - \phi) = \frac{d}{dr}(2\,\Delta z) = -(1+K)\frac{r^3}{R^3}. \tag{4.2.3} $$


We now relate this difference in ray directions to TSA. 


4.2.b. ANGULAR


This difference in direction between the reflected rays is the angular spherical
aberration (ASA). Because Eq. (4.2.3) is only taken to third order, it is
appropriate to say ASA3 for the difference in direction given in Eq. (4.2.3).

Fig. 4.3. Path difference between ray reflected from paraboloid (solid curve) and conic (dashed
curve). Size of Δz, given in Eq. (4.2.2), is greatly exaggerated in the diagram.

From the geometry in Fig. 4.4 we see directly the relation between transverse
and angular aberration, which is

$$ \mathrm{TSA3} = (R/2)(\mathrm{ASA3}) = -(1+K)\frac{r^3}{2R^2}, \tag{4.2.4} $$

where, as before, we assume the angle $\phi_p$ is not too large. This result is the same
as the first term in Eq. (4.2.1).

We need to review briefly the steps taken in arriving at Eq. (4.2.4) and the 
approximations used. In stating that the path difference between the two rays is 
2 Az, we replaced any cosine factors present in the exact difference by unity. In 
effect, we used the paraxial approximation in making this statement. The same 
approximation was made in writing the relation between TSA3 and ASA3 in Eq. 
(4.2.4). This approximation is quite good, even for a mirror as fast as f/2. In this
case we find tan φ = 0.25 and cos φ = 0.97, hence our third-order result is
accurate to a few percent. If we wanted to use the same method to find the fifth- 
order term, we could not use the paraxial approximation but would need to retain 
higher power terms in the expansions of tangents and cosines of angles. But, as 
already noted, third-order aberration results suffice for reflecting surfaces in most 
optical systems. 

The procedure followed to get the third-order result in Eq. (4.2.4) can be
generalized to any pair of object and image conjugates. All that is needed is Δz,
the difference between the reflecting surface that images that object without
aberration and the actual surface, with the paraxial approximation used in the
same way as in the preceding. The relations are

$$ \mathrm{ASA3} = \frac{d}{dr}(2\,\Delta z), \qquad \mathrm{TSA3} = s'\,\frac{d}{dr}(2\,\Delta z). \tag{4.2.5} $$

Fig. 4.4. Relation between TSA and angular difference between ray paths after reflection. See Eq.
(4.2.3).

Equations (4.2.5) apply specifically to reflecting surfaces in air in which the
mirror is oriented as shown in Fig. 4.1. In general, Fermat’s Principle and the
calculation of Δz require the optical path difference, which includes the index
of refraction. The more general discussion of Eq. (4.2.5), including the index
of refraction, is given in Chapter 5.

The importance of the procedure outlined in the paragraph preceding Eq.
(4.2.5) lies in its utility when applied to optical systems with more than one
surface. In Chapter 5 we develop the method by which Δz can be determined in a
general way for any optical system with any object location. Once Δz is known, it
is then a straightforward matter to calculate the angular and transverse
aberrations.


4.2.c. EXAMPLE: SPHERE WITH FINITE CONJUGATES 


As an illustration of the utility of Eqs. (4.2.5) we consider an object point at a
finite distance and an ellipsoid with the correct conic constant needed to form a
perfect image. If a sphere is used in place of the ellipsoid, aberration is present in
the image. Following the preceding prescription, we find the difference Δz
between these two surfaces. From Eq. (4.1.4) we get

$$ \Delta z = z_e - z_s = K_e r^4/8R^3, $$

through the terms of interest. Therefore Eqs. (4.2.5) give

$$ \mathrm{ASA3} = K_e (r^3/R^3), \qquad \mathrm{TSA3} = K_e (r^3/R^3)\,s', \tag{4.2.6} $$

where the range of $K_e$ for the ellipsoid is $-1 < K_e < 0$ for real conjugates. It is
convenient to rewrite Eqs. (4.2.6) in terms of the transverse magnification m.
Eliminating s between Eqs. (2.3.1) and (2.3.3) gives s′/R = (1 − m)/2. Substituting
this and Eq. (3.5.4) for $K_e$ we get

$$ \mathrm{ASA3} = -\left(\frac{m+1}{m-1}\right)^2 \frac{r^3}{R^3}, \tag{4.2.7a} $$

$$ \mathrm{TSA3} = \frac{(m+1)^2}{m-1}\,\frac{r^3}{2R^2}. \tag{4.2.7b} $$

These relations give the spherical aberration of a sphere used at magnification m.
Note that m < 0 for real conjugates, hence TSA for a concave spherical mirror
always has the same sign for a given r, independent of the sign of R. This sign is
such that the focus for marginal rays, those rays reflected from the edge of the
mirror, is closer to the vertex than the paraxial focus, as shown in Fig. 4.2. Note
that the substitution of m = 0 into Eqs. (4.2.7) gives the same ASA3 and TSA3
as when K = 0 is substituted into Eqs. (4.2.3) and (4.2.4), as expected. As a final
comment, note that the spherical aberration is zero when m = −1. For this
magnification s = s′, and the sphere is the perfect surface according to Fermat’s
Principle.
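The dependence on magnification is easy to tabulate. The sketch below evaluates ASA3 and TSA3 for a sphere from $K_e r^3/R^3$ and $s'/R = (1-m)/2$. It assumes that Eq. (3.5.4), which is not reproduced in this section, has the form $K = -[(m+1)/(m-1)]^2$; this form is consistent with the limits m = 0 (paraboloid) and m = −1 (sphere) quoted above. Function names are illustrative.

```python
def conic_constant_for_magnification(m):
    """Conic constant giving a perfect image at magnification m.
    Assumed form of Eq. (3.5.4): K = -((m + 1)/(m - 1))**2."""
    return -((m + 1.0) / (m - 1.0)) ** 2

def sphere_sa3(r, R, m):
    """Return (ASA3, TSA3) for a spherical mirror used at magnification m."""
    K_e = conic_constant_for_magnification(m)
    asa3 = K_e * r**3 / R**3          # angular aberration, Eq. (4.2.6)
    s_prime = R * (1.0 - m) / 2.0     # image distance from s'/R = (1 - m)/2
    return asa3, asa3 * s_prime       # TSA3 = s' * ASA3

if __name__ == "__main__":
    for m in (0.0, -0.5, -1.0, -2.0):
        print(m, sphere_sa3(0.1, 1.0, m))
```

The m = 0 row reproduces the collimated-light values −r³/R³ and −r³/2R², and the m = −1 row is zero.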


4.2.d. DISTRIBUTION OF RAYS NEAR FOCUS 


As given by the preceding relations, TSA is a measure of the image size of a 
point object at the paraxial focus. The distribution of rays near the paraxial focus 
is such that the image has a minimum size between paraxial focus and the focus 
for marginal rays. Here we consider the ray distribution as seen in cross section, 
both along and perpendicular to the optical axis. 

Fig. 4.5. Ray distribution near paraxial focus for image with spherical aberration. Paraxial focus is
at (0, 0). See Eq. (4.2.8) for definition of parameters.

A cross section of the image along the z-axis near paraxial focus is shown in
Fig. 4.5 for a spherical mirror with m = 0 and focal ratio F = 2. The paraxial
focus is at the origin of the (y′, z′) coordinate frame. Each ray is drawn so that it
crosses the z-axis at a distance LSA3 (or Δf) from the paraxial focus and
intersects the paraxial focal plane at height TSA3. These coordinates are

$$ y' = -r^3/2R^2, \quad \text{at } z' = 0, \tag{4.2.8a} $$

$$ z' = -r^2/4R, \quad \text{at } y' = 0, \tag{4.2.8b} $$

where y′ = TSA3 from Eq. (4.2.4) and z′ = Δf from Eq. (4.1.6). Note that the
vertical scale in Fig. 4.5 is stretched relative to the horizontal scale. It is easy to
see in Fig. 4.5 that the image width at the blur of minimum diameter is about four
times smaller than that set by TSA3 at the paraxial focus. The image at its
minimum size is called the disk or circle of least confusion (clc). The location of
the clc, z′(clc), is given by z′(clc) = 0.75 z′(marginal), hence
r²(clc) = 0.75 r²(marginal) from Eq. (4.2.8b). It follows that
r(clc) = 0.866 r(marginal), a result we use in our analysis of a Schmidt camera in
Section 4.5.b of this chapter. Analytical calculations, such as those given by
Welford (1986), support these graphical conclusions.
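The graphical conclusions can also be checked with a few lines of code. In the third-order picture each ray from zone r passes through y′ = −r³/2R² in the paraxial plane and crosses the axis at z′ = −r²/4R. Writing ρ = r/r(marginal) and letting w be the fractional distance from paraxial focus (w = 0) to marginal focus (w = 1), the ray height in plane w is proportional to ρ³ − wρ, in units of the marginal TSA3. The sketch below, a brute-force scan with illustrative names and grid sizes, finds the plane of smallest blur.

```python
def blur_radius(w, n_zones=1001):
    """Largest |rho^3 - w*rho| over the aperture, in units of |TSA3(marginal)|.
    w = 0 is the paraxial focal plane, w = 1 the marginal focal plane."""
    zones = [i / (n_zones - 1) for i in range(n_zones)]
    return max(abs(rho**3 - w * rho) for rho in zones)

def circle_of_least_confusion(n_planes=401):
    """Scan focal planes and return (w, radius) of the minimum blur."""
    planes = [i / (n_planes - 1) for i in range(n_planes)]
    return min(((w, blur_radius(w)) for w in planes), key=lambda p: p[1])

if __name__ == "__main__":
    w, radius = circle_of_least_confusion()
    print(w, radius)        # location and size of the minimum blur
    print(w ** 0.5)         # zone rho whose rays cross the axis at this plane
```

The scan places the minimum three-quarters of the way from paraxial to marginal focus, with a radius one-quarter of the marginal TSA3, and the zone in focus there is ρ = √0.75 ≈ 0.866.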

Image cross sections perpendicular to the optical axis, commonly called spot
diagrams, are shown in Fig. 4.6. From a point source at infinity, a set of rays
distributed uniformly over the aperture of this mirror is traced through the system.
The cuts through the bundle of rays are shown equally spaced between the
marginal focus on the left and the paraxial focus on the right. Note that the
concentration of rays in the center of the image is somewhat smaller in the middle
image in Fig. 4.6, even though the overall image diameter is larger than at the clc.
We show in Chapter 10 that this image, the one midway between paraxial and
marginal focus, is the one with the minimum root-mean-square (rms) wavefront
error.

Fig. 4.6. Spot diagrams near paraxial focus for image with spherical aberration. Images are
equally spaced from marginal focus (MF) at the left to paraxial focus (PF) at the right.

From our discussion we see that the diameter of the circle of least confusion is
$|r^3/4R^2|$ when the object is at infinity. At the mirror this blur subtends an angle α
where

$$ \alpha = r^3/2R^3 = 1/128F^3. \tag{4.2.9} $$

As the focal ratio F increases, the subtended angle α decreases. A point is
reached, however, where the image diameter no longer decreases but reaches the
limit set by diffraction according to Eq. (3.7.1). The smallest F for which a
spherical mirror used to image a distant object is approximately diffraction-
limited is found by equating α in Eq. (4.2.9) to θ in Eq. (3.7.1). The result is
D ≈ 128λF³, in general, or D ≈ 0.007F³ with D in centimeters, for λ = 550 nm.
As examples, for green light, we find F ≈ 11 for D = 10 cm, and F ≈ 24 for
D = 1 m. Thus, in spite of spherical aberration, a spherical mirror in collimated
light is effectively diffraction-limited, provided the focal ratio is large enough.
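The two numerical examples follow from solving D ≈ 128λF³ for F. A minimal sketch, assuming SI units and a hypothetical function name:

```python
def min_focal_ratio(D, wavelength=550e-9):
    """Smallest focal ratio for which a spherical mirror of diameter D is
    approximately diffraction-limited, from D = 128 * wavelength * F**3.
    D and wavelength are in meters."""
    return (D / (128.0 * wavelength)) ** (1.0 / 3.0)

if __name__ == "__main__":
    for D in (0.10, 1.0):
        print(D, min_focal_ratio(D))
```

Because F scales only as the cube root of D, a tenfold increase in aperture raises the required focal ratio by a factor of about 2.15.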

An interesting exercise left to the reader is to take D ≈ 128λF³, solve for λ,
and substitute the result into Eq. (4.2.2). Setting K = 0 we find that Δz, the
difference between a paraboloid and a “diffraction-limited” sphere at the margins,
is approximately λ/8. Hence the path difference between two marginal rays from
the two mirrors is ~λ/4. Alternatively, the wavefront emerging from the
spherical mirror is no longer spherical but differs from the spherical wavefront
emerging from the paraboloid by λ/4 at the margin. Although this limit is found
here in a special case, it turns out that this is a useful criterion for establishing
when any optical system gives images that are approximately diffraction-limited.
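One way to work the exercise is sketched below. It uses only magnitudes, with marginal ray height r = D/2, radius of curvature |R| = 2f, and focal ratio F = f/D, so that r/|R| = 1/4F.

```latex
\begin{aligned}
|\Delta z| &= \frac{r^4}{8|R|^3}
            = \frac{r}{8}\left(\frac{r}{|R|}\right)^{3}
            = \frac{D}{16}\cdot\frac{1}{64F^{3}}
            = \frac{D}{1024F^{3}}, \\[4pt]
D \simeq 128\lambda F^{3}
  \;&\Rightarrow\;
  |\Delta z| \simeq \frac{128\lambda}{1024} = \frac{\lambda}{8},
  \qquad 2|\Delta z| \simeq \frac{\lambda}{4}.
\end{aligned}
```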


4.3. REFLECTING CONICS AND FINITE OBJECT DISTANCE 


The analysis in Section 4.1 leading to Eq. (4.1.6) and an expression for 
spherical aberration in Eq. (4.2.1) is restricted to an object at infinity. For 
completeness we extend this procedure and consider an object at a finite distance. 
We outline the method by which s’ and TSA through fifth-order can be 
determined, and give the derived relations. 

Consider a concave mirror with an object located on the optical axis at a finite
distance s. The geometry of a ray intersecting a mirror at height r from the optical
axis is shown in Fig. 4.7. From the triangles in Fig. 4.7 we get

$$ \tan\alpha = \frac{r}{s - z}, \qquad \tan\beta = \frac{r}{s' - z}, \qquad
   \tan\phi = \frac{dz}{dr} = \frac{r}{R - (1+K)z}, \tag{4.3.1} $$

where φ = γ + α = β − γ. Solving for s′ we find

$$ s' - z = \frac{r}{\tan\beta}
   = \frac{r\,[1 + \tan\phi\,(2\tan\alpha - \tan\phi)]}{2\tan\phi - \tan\alpha\,(1 - \tan^2\phi)}. \tag{4.3.2} $$

Fig. 4.7. Geometry of ray from point source at finite distance reflected from a concave mirror.





The procedure now is to take each tangent function in Eq. (4.3.2), expand it as a
power series in r/R, and substitute for z with Eq. (4.1.4). After considerable
algebra we find that the resulting longitudinal spherical aberration LSA or Δs′ is
given by

$$ \Delta s' = s' - s'_0 = -\frac{r^2}{4R}(m-1)^2\left[K + \left(\frac{m+1}{m-1}\right)^2\right]
   \times \left\{1 + \frac{r^2}{4R^2}\left[K + 3 + 2m(K+1)\right]\right\}, \tag{4.3.3} $$

where $s'_0$ is the paraxial image distance.
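We now proceed to the transverse spherical aberration. From Fig. 4.7 we see
that TSA = LSA tan β = r Δs′/(s′ − z). Using Eq. (4.3.3) and the expansion of
Eq. (4.3.2) we find

$$ \mathrm{TSA} = -\frac{r^3}{2R^2}(1-m)\left[K + \left(\frac{m+1}{m-1}\right)^2\right]
   \left\{1 + \frac{3r^2}{4R^2}\left[K - \frac{m+3}{m-1}\right]\right\}. \tag{4.3.4} $$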


Equations (4.3.3) and (4.3.4) can be used as given, but when comparing results
from these equations with those given by ray-tracing programs it is necessary to
express r in terms of y, the height on the tangent plane to the mirror of the
incident ray. The geometric relation between r and y follows from Fig. 4.7. Using
the relation for tan α in Eqs. (4.3.1) and noting that tan α = y/s, we get

$$ r = y\left(1 - \frac{z}{s}\right) = y\left[1 - \left(\frac{m}{m-1}\right)\frac{y^2}{R^2}\right]. \tag{4.3.5} $$

Substituting Eq. (4.3.5) into Eq. (4.3.4) we find

$$ \mathrm{TSA} = -\frac{y^3}{2R^2}(1-m)\left[K + \left(\frac{m+1}{m-1}\right)^2\right]
   \left\{1 + \frac{3y^2}{4R^2}\left[K - \frac{5m+3}{m-1}\right]\right\}, \tag{4.3.6} $$

where TSA = TSA3 + TSA5. Note that choosing K in Eq. (4.3.6) according to
Eq. (3.5.4) gives zero TSA, as expected. It is left as an exercise for the reader to
verify that Eq. (4.2.7b) follows from Eq. (4.3.6) with K = 0.

Comparing the third-order terms in Eqs. (4.3.4) and (4.3.6) we see that they 
are the same if r = y. In the paraxial approximation the height of a ray at the 
surface is always the same as its projection on the tangent plane to the surface. It 
turns out in general that all third-order aberrations can be expressed entirely in 
terms of paraxial (or first-order) parameters. This is not true for aberrations of 
higher order which are affected by the size of those of lower order. This effect is 
evident here in the fifth-order terms in Eqs. (4.3.4) and (4.3.6). Fortunately, third- 
order aberration results usually suffice when analyzing an optical system. 
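The series results of this section can be compared with an exact trace built from the tangent relations tan α = r/(s − z), tan φ = r/[R − (1+K)z], and s′ − z = r[1 + tan φ(2 tan α − tan φ)]/[2 tan φ − tan α(1 − tan²φ)]. The sketch below makes several assumptions that are not spelled out in this section: the surface equation (3.5.5) is taken as r² − 2Rz + (1+K)z² = 0, the paraxial relations are taken as 1/s + 1/s′ = 2/R with m = −s′/s (consistent with s′/R = (1 − m)/2), and the series forms are those of Eqs. (4.3.3) and (4.3.4) as written in the code. It requires m ≠ 0.

```python
import math

def sag(r, R, K):
    """Exact conic sag, assuming r^2 - 2Rz + (1+K)z^2 = 0."""
    if abs(1.0 + K) < 1e-12:
        return r * r / (2.0 * R)
    return R / (1.0 + K) * (1.0 - math.sqrt(1.0 - (1.0 + K) * r * r / R**2))

def trace(r, R, K, m):
    """Exact (LSA, TSA) for the ray striking the mirror at height r."""
    s0 = R * (1.0 - m) / 2.0            # paraxial image distance s'_0
    s = R * (m - 1.0) / (2.0 * m)       # object distance (assumes m = -s'/s)
    z = sag(r, R, K)
    ta = r / (s - z)                    # tan(alpha)
    tp = r / (R - (1.0 + K) * z)        # tan(phi)
    sp = z + r * (1.0 + tp * (2.0 * ta - tp)) / (2.0 * tp - ta * (1.0 - tp * tp))
    lsa = sp - s0
    return lsa, r * lsa / (sp - z)      # TSA = LSA * tan(beta)

def series(r, R, K, m):
    """(LSA, TSA) through fifth order, Eqs. (4.3.3) and (4.3.4)."""
    q = K + ((m + 1.0) / (m - 1.0)) ** 2
    lsa = (-(r**2 / (4.0 * R)) * (m - 1.0) ** 2 * q
           * (1.0 + r**2 / (4.0 * R**2) * (K + 3.0 + 2.0 * m * (K + 1.0))))
    tsa = (-(r**3 / (2.0 * R**2)) * (1.0 - m) * q
           * (1.0 + 3.0 * r**2 / (4.0 * R**2) * (K - (m + 3.0) / (m - 1.0))))
    return lsa, tsa

if __name__ == "__main__":
    print(trace(0.1, 1.0, 0.0, -0.5))
    print(series(0.1, 1.0, 0.0, -0.5))
```

For a sphere at m = −0.5 and r/R = 0.1 the exact and series values agree to a few parts in 10⁵, while dropping the fifth-order factor leaves an error of roughly 1%.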


4.4. OFF-AXIS ABERRATIONS 


We now turn our attention briefly to off-axis aberrations, those aberrations 
present when the object point does not lie on the optical axis. In this section we 
want only to indicate the nature of these aberrations; a general discussion of this 
subject is the topic of the next chapter. 





Fig. 4.8. Collimated beam at angle θ with optical axis incident on paraboloid. Here O′ is the
origin of the rotated coordinate system; point C lies on both the z and z′ axes.

To illustrate the source of off-axis aberrations we consider a special case, a
paraboloid in collimated light. Figure 4.8 shows a cross section of a paraboloid
with optical axis z and vertex at O. The image at B is, of course, a perfect one
geometrically. The same paraboloid images a distant point object at angle θ from
the z-axis at point B′, where the distance BB′ is approximately fθ.

To determine the kinds of aberrations present in the image at B′, we find a
system that takes the rays at angle θ and forms a perfect image at B′. This system
is obviously a paraboloid whose optical axis is parallel to the incident beam and
passes through B′, with its vertex at distance f from B′. The coordinate system for
this paraboloid is shown in Fig. 4.8; the optical axis is denoted by z′ and the
vertex is at O′. We then find the distance Δz between these two paraboloids in the
yz plane and use Eq. (4.2.5) to find the third-order aberrations.

Omitting the details of these steps we find

$$ \mathrm{AA3} = a_1\frac{y^2\theta}{R^2} + a_2\frac{y\theta^2}{R} + a_3\theta^3, \tag{4.4.1} $$

where AA3 represents angular aberration to third-order.

The terms in Eq. (4.4.1) represent different aberrations: The first is coma, the
second is astigmatism, and the last is distortion. The character of each aberration
is quite different because of the different way in which each depends on y and θ.
Our following description of each aberration is limited to the yz-plane and is,
therefore, incomplete. A complete description, based on rays over the entire
aperture, is given in Chapter 5.

Coma is proportional to y²θ and hence is changed in sign when θ changes
sign. Coma is invariant to the sign of y and therefore rays from opposite sides of
the mirror are on the same side of the central ray in the vicinity of the paraxial
focus.

Astigmatism is proportional to yθ² and hence is unchanged by a sign change in
θ. A change in the sign of y changes the sign of the astigmatism and therefore
rays from opposite sides of the mirror are on opposite sides of the central ray near
paraxial focus.

Distortion is proportional to θ³ and does not depend on y. Thus this aberration,
if it is the only one present, does not affect the image quality, only its position.
For a set of point objects equally spaced perpendicular to the optical axis, the set
of images would not be equally spaced if distortion is present.

There is one final aberration that is present in Fig. 4.8 (it does not appear in
Eq. (4.4.1)); the aberration is called curvature of field. Its character is most easily
seen by noting that the transformation that takes the origin from O to O′ is
essentially one of a rotation about the center of curvature C. Thus the motion of
point B in Fig. 4.8 to B′ is along a circular arc whose center is C. The foci for
different θ, in the absence of other aberrations, are located on a curved surface,
hence the name curvature of field.

At this stage five third-order aberrations have been identified: spherical
aberration, coma, astigmatism, distortion, and curvature of field. The first of
these is independent of field angle, but all others depend on some power of θ. The
first three aberrations in this list affect image quality, while the last two affect only
image position.

From Eqs. (4.2.7b) and (4.4.1) we see that the transverse aberrations depend
on aperture radius y (or r) and field angle θ according to the relation

$$ \text{transverse aberration} \propto y^n\theta^m, \tag{4.4.2} $$

where n + m = 3. Hence each of these aberrations is called a third-order
aberration. The main task in the analysis of the image quality in any optical 
system is the determination of how much of each of these aberrations is present, 
and then eliminating or reducing the amount of each by proper selection of 
system parameters. 


4.5. ABERRATION COMPENSATION 


In Section 3.7 we noted that a perfect optical system is one in which the 
wavefront emerging from the final surface is spherical. From the discussion in 
this chapter it is evident that there is a close relation between deviations from a 
spherical wavefront and the appearance of aberrations. Along any ray the actual 
wavefront may be behind or in front of the ideal wavefront, depending on whether 
that portion of the wavefront has been retarded or advanced. 

Although the analysis so far has been limited to aberrations of a single mirror 
optical system, it should be evident that compensation of aberrations, wholly or in 
part, should be possible in systems with more than one surface. In terms of 
Fermat’s Principle, exact compensation means that a wavefront advance intro- 
duced by one or more surfaces is canceled by an equal wavefront retardation 
introduced by other surfaces. As far as the final wavefront is concerned, it is only 
the net advance or retardation that determines the size of any image defect. 

In this section we examine two systems, a Cassegrain telescope and a Schmidt 
camera, for each of which the net third-order spherical aberration is zero. Each 
system is composed of two optical elements chosen so that a wavefront advance 
due to one element is balanced by an equal retardation introduced by the other. 
The object point for each is a distant point source on the optical axis. 


4.5.a. CASSEGRAIN TELESCOPE


The configuration of mirrors in a Cassegrain telescope is shown in Fig. 2.7a. 
Based on the discussion in Section 3.4, one pair of mirrors for which the spherical
aberration is zero is a paraboloidal primary and a hyperboloidal secondary. The
former is the perfect mirror for a point object at infinity, while the latter is perfect
for finite conjugate points on opposite sides of the mirror. From Eq. (3.5.4) we
find the conic constants

$$ K_1 = -1, \qquad K_2 = -\left(\frac{m+1}{m-1}\right)^2, \tag{4.5.1} $$

where m is the magnification of the secondary. Choosing a value of β sets the
normalized distance from the vertex of the primary to the final focal point. With
the values of m and β chosen, the paraxial relations in Eqs. (2.5.1) are used to
find the values of k and ρ. If, for example, m = 5 and β = 0.2, then
k = 0.2, ρ = 0.25, and $K_2$ = −2.25. The telescope, specified so far only in
terms of normalized parameters, is fully defined once the primary mirror
diameter and focal length are chosen.
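The numerical example can be reproduced with a short routine. Equations (2.5.1) are not reproduced in this chapter; the sketch below assumes they take the form k = (1 + β)/(m + 1) and ρ = mk/(m − 1), which does return the quoted values k = 0.2 and ρ = 0.25 for m = 5 and β = 0.2. The function name and dictionary keys are illustrative.

```python
def classical_cassegrain(m, beta):
    """Normalized parameters of a classical Cassegrain.

    Assumed paraxial relations, standing in for Eqs. (2.5.1):
        k   = (1 + beta) / (m + 1)    ratio of marginal ray heights y2/y1
        rho = m * k / (m - 1)         ratio of radii of curvature R2/R1
    Conic constants follow Eq. (4.5.1).
    """
    k = (1.0 + beta) / (m + 1.0)
    rho = m * k / (m - 1.0)
    K1 = -1.0
    K2 = -((m + 1.0) / (m - 1.0)) ** 2
    return {"k": k, "rho": rho, "K1": K1, "K2": K2}

if __name__ == "__main__":
    print(classical_cassegrain(5.0, 0.2))
```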

The paraboloid-hyperboloid combination specified in Eq. (4.5.1) is called a 
classical Cassegrain. We now show how this configuration can be changed into a 
different one by changing the conic constants of both the primary and secondary 
mirrors. This is done in a way that keeps the net third-order spherical aberration 
(SA3) zero, hence a change in $K_1$ is accompanied by a change in $K_2$ such that the
wavefront advance at one mirror is equal to the wavefront retardation at the other,
to third-order. Stated in terms of Fermat’s Principle, the OPL from the object to 
the image along any ray does not change. 

Starting with the classical Cassegrain configuration in Fig. 4.9, each surface is
changed into a different conic by “bending” the original surface. If the new
surfaces lie to the left of the original surfaces, as shown in Fig. 4.9, then the
wavefront has been advanced at the primary and retarded at the secondary. The
advance and retardation are $2\Delta z_1$ and $2\Delta z_2$, where $\Delta z_1$ and $\Delta z_2$ are the surface
differences at the primary and secondary, respectively. Using Eq. (4.1.4) each
surface has z as follows:

$$ \begin{aligned}
z_1(\text{original}) &= \frac{y_1^2}{2R_1}, \\
z_1(\text{new}) &= \frac{y_1^2}{2R_1} + (1+K_1)\frac{y_1^4}{8R_1^3}, \\
z_2(\text{original}) &= \frac{y_2^2}{2R_2} + \left[1 - \left(\frac{m+1}{m-1}\right)^2\right]\frac{y_2^4}{8R_2^3}, \\
z_2(\text{new}) &= \frac{y_2^2}{2R_2} + (1+K_2)\frac{y_2^4}{8R_2^3}.
\end{aligned} $$

Fig. 4.9. Cassegrain telescope schematic. Classical Cassegrain has mirrors shown by solid curves;
modified Cassegrain has mirrors shown by dashed curves. The $R_1$, $R_2$, m, and k are the same for both
telescopes. Surface differences are given in Eqs. (4.5.2).

Therefore

$$ 2\Delta z_1 = (1+K_1)\frac{y_1^4}{4R_1^3}, \tag{4.5.2a} $$

$$ 2\Delta z_2 = \left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right]\frac{y_2^4}{4R_2^3}, \tag{4.5.2b} $$

where $R_1$ and $R_2$ are held constant. Applying the condition that the advance
equals the retardation requires $2\Delta z_1 = 2\Delta z_2$. Note that this is equivalent to
stating that the optical distance from object to image is unchanged, hence
Fermat’s Principle is satisfied for all rays from a distant point source.

Applying this condition gives

$$ K_1 + 1 = \frac{y_2^4}{y_1^4}\,\frac{R_1^3}{R_2^3}\left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right]
           = \frac{k^4}{\rho^3}\left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right]. \tag{4.5.3} $$

As an example, take the values of the paraxial parameters already given here for
the classical Cassegrain and choose $K_2$ = 0. From Eq. (4.5.3) we get
$K_1$ = −0.7696. This combination of an ellipsoidal primary and a spherical
secondary is called a Dall-Kirkham telescope. Another possible choice is
$K_1$ = −1.02 and, from Eq. (4.5.3), $K_2$ = −2.4453. It turns out that this choice
of conic constants gives a telescope called the Ritchey-Chretien telescope.
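Both examples can be reproduced from the zero-SA3 condition $K_1 + 1 = (k^4/\rho^3)[K_2 + ((m+1)/(m-1))^2]$. The sketch below solves the condition in either direction; the function names are illustrative, and the values m = 5, k = 0.2, ρ = 0.25 are those of the example in the text.

```python
def primary_conic(K2, m, k, rho):
    """K1 that makes SA3 zero for a given secondary conic K2, Eq. (4.5.3)."""
    return -1.0 + (k**4 / rho**3) * (K2 + ((m + 1.0) / (m - 1.0)) ** 2)

def secondary_conic(K1, m, k, rho):
    """K2 that makes SA3 zero for a given primary conic K1, Eq. (4.5.3)."""
    return (1.0 + K1) * rho**3 / k**4 - ((m + 1.0) / (m - 1.0)) ** 2

if __name__ == "__main__":
    m, k, rho = 5.0, 0.2, 0.25
    print(primary_conic(0.0, m, k, rho))       # spherical secondary
    print(secondary_conic(-1.02, m, k, rho))   # slightly hyperboloidal primary
    print(primary_conic(-2.25, m, k, rho))     # classical Cassegrain check
```

The last line recovers $K_1$ = −1 for the classical secondary, confirming that the classical Cassegrain is one member of the family.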

The solutions of Eq. (4.5.3) represent the family of Cassegrain telescopes for
which SA3 of a distant point source is zero. For a given set of k, m, and ρ there is
an infinity of combinations of $K_1$ and $K_2$ that satisfy Eq. (4.5.3). In practice the
choice of $K_1$ and $K_2$ from this set depends on other considerations, such as
off-axis aberrations and the ease with which the mirrors can be made and tested. In
the case of a Dall-Kirkham, for example, the separate mirrors are easily tested but
large coma results in a small usable field. On the other hand, the Ritchey-Chretien
telescope has zero coma but hyperboloidal mirrors that are more difficult
to make and test. Discussions of the aberrations of Cassegrain and other
two-mirror telescopes are given in Chapter 6.

It is important to note here that Eq. (4.5.3) and the procedure used to derive it 
do not ensure that higher-order spherical aberration is also zero. Except for the 
two-mirror telescope with conic constants given by Eqs. (4.5.1), other two-mirror 
telescopes generally have higher-order spherical aberration. 


4.5.b. SCHMIDT CAMERA 


A Schmidt camera is composed of three elements: a concave spherical mirror, 
an aperture stop whose center is located at the center of curvature of the mirror, 
and a refracting plate in the plane of the stop, as shown schematically in Fig. 4.10. 
For the moment ignore the refracting plate and consider only the mirror and stop. 
Placement of the stop at the center of curvature gives a system that is effectively 
axis-free. Rays through the stop from an off-axis object point “see” an optical 
system, a portion of the spherical mirror, which is the same as that for rays from 
an on-axis point. In effect, any line through C from an off-axis point is equivalent 
to the z-axis. This axis-free character is, of course, true only for the sphere 
because of its constant curvature. Therefore the aberrations of an image of any 
object point, for this arrangement of mirror and stop, are just those of an on-axis 
image, namely, spherical aberration. Because of the rotational symmetry about 
point C, the image surface is spherical and curvature of field is present but, as 
already noted, this aberration does not affect image quality. 

Because this very simple optical system is free of astigmatism and coma, it is
the basis for cameras and telescopes that are designed for wide-field applications.
Note that freedom from these aberrations is true for objects at any point to the left
of the stop in Fig. 4.10, not just for collimated beams from distant point sources.

Fig. 4.10. Schmidt camera configuration. Center of curvature C of spherical mirror is at center of
aperture stop. Surface figure on glass plate at stop is given by Eq. (4.5.5).

The remaining optical element, the refracting plate or corrector, serves the 
function of correcting the spherical aberration due to the mirror, where the 
wavefront advance at the mirror is compensated by an equal wavefront retardation 
by the corrector. To find the required wavefront advance we take a distant object
point, the only case considered here, and find Δz between the sphere and a
reference paraboloid. The latter surface is, of course, the surface that would give a
perfect on-axis image.

From Eq. (4.2.2) we find for K = 0 that the wavefront advance at the mirror is

$$ 2\,\Delta z = -r^4/4R^3. \tag{4.5.4} $$


Consider a plane-parallel plate of thickness ¢ and index n. At any height y near 
one surface of this plate we remove a layer of air of thickness t and replace it with 
a layer of glass of optical thickness nt. The net change in optical path due to this
change is (n − 1)t for a ray parallel to the z-axis. Because the light is “slowed
down” in the glass, this optical path difference is the required retardation and

$$ (n-1)\,t = 2\,\Delta z = -r^4/4R^3. \tag{4.5.5} $$

Defining ρ = r/r₀, where r₀ is the radius of the aperture stop, and noting that
f = −R/2, gives

$$ t = \frac{\rho^4 r_0^4}{32(n-1)f^3} = \frac{\rho^4 D}{512(n-1)F^3}, \tag{4.5.6} $$

where D = 2r₀ is the aperture diameter and F = f/D.


For an otherwise flat plate Eq. (4.5.6) defines the surface figure on one face 
required to correct the spherical aberration of the mirror. From the point of view 
of Fermat’s Principle it does not matter whether the figured surface faces the 
mirror or the incident light. In either orientation rays at the edge of the aperture 
are retarded relative to those near the axis. Note also that Eq. (4.5.6) applies only 
to the case where the perfect reference surface is a paraboloid. For objects whose 
distance is not effectively at infinity the appropriate reference surface is an 
ellipsoid, but the procedure for finding t is the same. 
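To give a feeling for the size of the figuring involved, the sketch below evaluates the corrector profile in both of its equivalent forms, t = ρ⁴r₀⁴/32(n − 1)f³ and t = ρ⁴D/512(n − 1)F³, with D = 2r₀ and F = f/D. The aperture, focal ratio, and index used in the example (D = 1 m, F = 2.5, n = 1.5) are hypothetical values chosen for illustration, not taken from the text.

```python
def corrector_depth(rho, D, F, n):
    """Glass depth t at normalized radius rho = r/r0, Eq. (4.5.6),
    for aperture diameter D, focal ratio F, and refractive index n."""
    return rho**4 * D / (512.0 * (n - 1.0) * F**3)

def corrector_depth_from_f(rho, r0, f, n):
    """Same profile written with stop radius r0 and focal length f."""
    return rho**4 * r0**4 / (32.0 * (n - 1.0) * f**3)

if __name__ == "__main__":
    D, F, n = 1.0, 2.5, 1.5            # hypothetical camera, lengths in meters
    for rho in (0.25, 0.5, 0.75, 1.0):
        print(rho, corrector_depth(rho, D, F, n))
```

For these values the depth at the edge of the plate is a quarter of a millimeter, and the ρ⁴ dependence means that half of the aperture radius needs only one-sixteenth of that.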

We now consider how the corrector plate has caused the compensation of the 
spherical aberration of the mirror. Rays through the corrector near its center are 
essentially undeviated and hence are focused at the paraxial focal point of the 
mirror. Rays farther out on the corrector are deviated away from the z-axis 
because in cross section the corrector is a thin prism. If the effective prism angle
is γ, as shown in Fig. 4.11, then the net ray deviation in the paraxial
approximation is (n − 1)γ. Because of this deviation both the point at which the ray
strikes the mirror and its angle of incidence are changed, and the reflected ray is
directed toward the paraxial focus. From the center of the corrector outward the
angle γ increases as r³ and is a maximum at the edge; thus the marginal rays are
deviated by the largest amount.

Fig. 4.11. Small section of corrector plate near edge, in cross section. Net ray deviation is
(n − 1)γ.

If the corrector had a constant index of refraction it would affect rays of any 
wavelength in the same way but, of course, this is not the case. Because n is not 
constant the deviation is also a function of wavelength. Denoting the deviation by
δ, a simple differentiation gives

$$ d\delta/d\lambda = \gamma\, dn/d\lambda. \tag{4.5.7} $$


Thus rays of different wavelength are directed in slightly different directions with 
the effect largest at the edge of the corrector. A system corrected at one 
wavelength is no longer corrected at other wavelengths and the image now has 
the aberration called chromatic spherical aberration. This image defect is always 
present when the corrector is a single element, but it can be minimized by 
selecting a different focal point for the system. 

Looking at Fig. 4.5 or 4.6 we see that the blur is smallest at the circle of least 
confusion. At this distance from the mirror neither the paraxial rays nor the 
marginal ones intersect the z-axis but, as noted following Eqs. (4.2.8), the rays
from the zone at 0.866r₀ are in focus on the axis. It is also evident from Fig. 4.5
that the maximum deviation necessary to bring all rays to this same focus is less 
than that required to bring the marginal rays to the paraxial focus. The reference 
surface needed to minimize the overall ray deviation is a paraboloid whose focal 
point is at the circle of least confusion in Fig. 4.5, hence it has a radius of 
curvature different from the one used to derive Eq. (4.5.4). 

The required change in R is 2Δf where, from Eq. (4.2.8b), we find


$$z = \Delta f = -\frac{(0.866\,r_0)^2}{4R} = -\frac{3r_0^2}{16R},$$

hence

$$R' - R = 2\,\Delta f = -\frac{3r_0^2}{8R}, \qquad (4.5.8)$$


where R’ is the radius of curvature of the modified paraboloid. Substituting Eq.
(4.5.8) into Eq. (4.1.4) gives


$$z' - z = \frac{r^2}{2}\left(\frac{1}{R'} - \frac{1}{R}\right) \simeq \frac{3r_0^2 r^2}{16R^3}. \qquad (4.5.9)$$


With reference to the new paraboloid the wavefront advance at the spherical
mirror is


$$2\,\Delta z = \frac{3r_0^2 r^2}{8R^3} - \frac{r^4}{4R^3}. \qquad (4.5.10)$$

Equating the wavefront advance and retardation gives the surface figure as

$$t = \frac{3r_0^2 r^2}{8(n-1)R^3} - \frac{r^4}{4(n-1)R^3} = \frac{\rho^4 D}{512(n-1)F^3}\left(1 - \frac{3}{2\rho^2}\right), \qquad (4.5.11)$$


where t < 0 over the entire corrector aperture. In contrast to the corrector whose 
profile is given by Eq. (4.5.6), this corrector is thickest at its center. Comparing 
Eq. (4.5.11) to Eq. (4.5.6) we also see that an additional term has been introduced 
into the surface figure, one that amounts to including a radius term. Rewriting Eq.
(4.5.11) we get


$$t = \frac{r^2}{2R_c} - \frac{r^4}{4(n-1)R^3}, \qquad (4.5.12)$$

where R_c, the radius of curvature of the modified corrector, is

$$R_c = \frac{4(n-1)R^3}{3r_0^2}. \qquad (4.5.13)$$


Throughout the analysis leading to Eq. (4.5.12), we have carefully followed the 
sign conventions established in Chapter 2. From the diagram in Fig. 4.10 we see 
that R < 0, hence R_c < 0 as well. In absolute terms t has its largest value when
r = 0.866r₀, as is easily verified by setting dt/dr = 0 and solving for r. The rays
for which the deviation is a maximum are those for which dt/dr is largest in an
absolute sense, which occurs at ρ = 0.5 and ρ = 1. The shapes of the corrector
profile and the emerging wavefront, greatly exaggerated, are shown in Fig. 4.12. 
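The statements about the zones ρ = 0.866, 0.5, and 1 can be checked numerically. The sketch below evaluates the second form of Eq. (4.5.11) and its slope, with ρ = r/r₀. The aperture D, focal ratio F, and index n are hypothetical values chosen only for illustration, and the function names are not from the text.

```python
import math

def corrector_depth(rho, D, F, n):
    """Eq. (4.5.11): t = rho^4 D / [512 (n-1) F^3] * (1 - 3/(2 rho^2)).

    Expanded as (rho^4 - 1.5 rho^2) so that rho = 0 needs no special case.
    """
    return D * (rho**4 - 1.5 * rho**2) / (512.0 * (n - 1.0) * F**3)

def corrector_slope(rho, F, n):
    """dt/dr = (2/D) dt/drho, because r = rho * D / 2; D cancels."""
    return (4.0 * rho**3 - 3.0 * rho) / (256.0 * (n - 1.0) * F**3)

# Hypothetical camera: 1 m aperture, f/2.5, n = 1.5 (not values from the text).
D, F, n = 1.0, 2.5, 1.5
grid = [i / 1000.0 for i in range(1, 1001)]

# Zone where |t| is largest; this should fall near sqrt(3)/2 = 0.866.
rho_deepest = max(grid, key=lambda p: abs(corrector_depth(p, D, F, n)))

# Slopes at the two zones of maximum ray deviation.
slope_half = corrector_slope(0.5, F, n)
slope_edge = corrector_slope(1.0, F, n)

print(f"deepest zone: rho = {rho_deepest:.3f}")
print(f"slope at rho = 0.5: {slope_half:+.3e}, at rho = 1: {slope_edge:+.3e}")
```

The two slopes come out equal in magnitude and opposite in sign, which is why either zone can be used to characterize the largest effective prism angle.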

Fig. 4.12. Profiles, greatly exaggerated, of corrector and emerging wavefront, with the incident and
retarded wavefronts labeled. Ray shown at height √3/2 of full aperture is undeviated.

As a final item for the Schmidt camera we calculate the chromatic TSA present
when the index n differs from the one used in the design profile. The starting
point is the wavefront retardation of the corrector, (n’ − 1)t, where n’ is the
variable. The change in retardation as a function of a change in index is t δn,
where δn = n’ − n and n is the design index. Equivalently, t δn is the optical path
difference (OPD) for index n’. Substitution of the OPD for 2Δz in Eq. (4.2.5) and
setting s’ = f gives the chromatic aberration. The result is


$$\mathrm{TSA3} = f\,\frac{d}{dr}(t\,\delta n) = \frac{f\rho^3\,\delta n}{64(n-1)F^3}\left(1 - \frac{3}{4\rho^2}\right). \qquad (4.5.14)$$


Putting in ρ = 0.5 or 1 gives the chromatic TSA for the rays that have the
maximum deviation or largest effective prism angle. The two values of ρ give the
same TSA value in magnitude and the result in absolute terms is


$$\mathrm{TSA3} = \frac{f\,\delta n}{256(n-1)F^3}, \qquad (4.5.15)$$

which is effectively the radius of the chromatic image.
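A short numerical sketch of Eqs. (4.5.14) and (4.5.15) follows. The focal length, focal ratio, design index, and index change are hypothetical numbers, not values from the text, and the function names are illustrative only.

```python
def chromatic_tsa(rho, f, dn, n, F):
    """Eq. (4.5.14): TSA3 = f rho^3 dn / [64 (n-1) F^3] * (1 - 3/(4 rho^2))."""
    return f * dn * (rho**3 - 0.75 * rho) / (64.0 * (n - 1.0) * F**3)

def chromatic_tsa_max(f, dn, n, F):
    """Eq. (4.5.15): magnitude at the zones rho = 0.5 and rho = 1."""
    return f * abs(dn) / (256.0 * (n - 1.0) * F**3)

# Hypothetical camera: f = 2500 mm at f/2.5, design index 1.5, index change 0.01.
f, F, n, dn = 2500.0, 2.5, 1.5, 0.01

tsa_half = chromatic_tsa(0.5, f, dn, n, F)  # zone at half the aperture radius
tsa_edge = chromatic_tsa(1.0, f, dn, n, F)  # marginal zone
tsa_max = chromatic_tsa_max(f, dn, n, F)

print(f"TSA3 at rho = 0.5: {tsa_half:+.4f} mm")
print(f"TSA3 at rho = 1.0: {tsa_edge:+.4f} mm")
print(f"Eq. (4.5.15):      {tsa_max:.4f} mm")
```

Because the result scales as 1/F³, the same index change produces a much larger chromatic blur in a faster camera.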

The prescription of the corrector profile required to correct the spherical 
aberration of a spherical mirror used in collimated light is correct through fourth- 
order terms in r. As noted in Eq. (4.2.1) there are higher-order terms in the 
expression for spherical aberration and Eq. (4.5.12) can be extended to eliminate 
their presence. The term in SA5 is significant for cameras of small focal ratio and
is considered in Chapter 7. For further details on aberrations of fifth and higher 
order, the reader should consult the books by Bouwers (1946) and Linfoot (1955). 

The introduction of the refracting corrector element gives an optical system 
with excellent image quality over a large field, with chromatic aberration setting 
the limit on the image quality. It should also be noted that the corrector does have 
an axis and therefore the camera is no longer axis-free. As a result there will be 
off-axis aberrations, though these aberrations are relatively small because the 
corrector is nearly a plane-parallel plate and is generally quite thin. For details on 
the magnitude of these off-axis effects the reader should consult an excellent 
article by Bowen (1960). This article is also of interest in showing how the
aberrations of a Schmidt camera can be calculated without recourse to Fermat’s 
Principle. 

The two systems treated in this section have only been examined in part. Our 
intent here has been to use Fermat’s Principle as a tool to facilitate the analysis of 
optical systems and to demonstrate its power in the process, at least for on-axis 
aberrations. The full capability of this tool, including analysis of off-axis 
aberrations, will be evident after the treatment in the following chapter. 
Chapter 5 Fermat’s Principle and Aberrations


At this point the stage is set for a general application of Fermat’s Principle to a 
surface of revolution and the derivation of its aberrations. The theory of 
aberrations, generally called the Seidel theory, is a classical subject and has 
been treated in detail by many authors, including Born and Wolf (1980), and 
Welford (1986). Excellent introductions to the theory of aberrations are given by 
Longhurst (1967) and Mahajan (1998). The treatment here leads to nothing new, 
but the approach is one that leads to results that can be easily applied to optical 
systems of specific interest to astronomers, such as telescopes, cameras, and 
spectrometers. Rather than simply citing results derived from the Seidel theory, 
we start with Fermat’s Principle and derive the desired relations in a systematic 
way. These aberration relations are then reduced to specific forms appropriate to 
given surface types, such as conic mirrors, spherical refracting surfaces, and 
aspheric plates as used in Schmidt cameras and telescopes. 


5.1. APPLICATION TO SURFACE OF REVOLUTION 


A sketch of a general surface of revolution about the z-axis is shown in Fig.
5.1, with the origin of the coordinate system at the vertex of the surface. The
homogeneous medium to the left of the surface has index n; the medium to the
right has index n’. The object and image points are at Q and Q’, respectively, and
an arbitrary ray from Q intersects the surface at B(x, y, z). Because the surface is
symmetric about the z-axis, there is no loss of generality in placing Q and Q’ in
the yz-plane.

Fig. 5.1. Path of arbitrary ray through refracting surface. Points Q and Q’ lie in the yz-plane; point
B is on the surface. The chief ray passes through the origin of the coordinate system.

The surface equation is a generalized form of Eq. (4.1.4) through
fourth-power terms of r as follows:


$$z = \frac{r^2}{2R} + \frac{(1+K)\,r^4}{8R^3} + \frac{b\,r^4}{8(n'-n)} = \frac{r^2}{2R} + \frac{r^4}{8}\left[\frac{1+K}{R^3} + \frac{b}{n'-n}\right] = \frac{r^2}{2R} + \frac{\alpha r^4}{8}, \qquad (5.1.1)$$


where α represents the quantity in brackets, and r² = x² + y². The term in b
explicitly includes the type of aspheric term required for corrector plates, one 
example of which is discussed in Section 4.5. The form of the b term in Eq. 
(5.1.1) is chosen to simplify its appearance in the aberration relations. In Fig. 5.1, 
the center of the aperture stop is located at the origin of the coordinate system. 
The case for an aperture stop displaced along the z-axis is considered later in this 
chapter. 
Applying Eq. (3.1.2) to the ray shown in Fig. 5.1 gives


$$\mathrm{OPL} = n[QB] + n'[BQ'], \qquad (5.1.2)$$


where [QB] and [BQ’] are the segments of the ray to the left and right of point B,
respectively. From the geometry in Fig. 5.1 we have


$$
\begin{aligned}
{}[QB] &= \left[(h - y)^2 + (z_0 - z)^2 + x^2\right]^{1/2}, \\
{}[BQ'] &= \left[(h' - y)^2 + (z_0' - z)^2 + x^2\right]^{1/2}.
\end{aligned}
\qquad (5.1.3)
$$

The dashed line in Fig. 5.1 from Q to Q’ passes through the center of the aperture
stop and intersects the surface at its vertex. This line represents the chief ray
through the system with angles θ and θ’ with respect to the z-axis. The OPL
measured along the chief ray is simply −ns + n’s’, where s is negative and s’ is
positive by the sign convention.

The signs of other quantities in Fig. 5.1 are as follows: θ, θ’, h’, and z₀’ are
positive, and h and z₀ are negative. The signs of θ and θ’ are the same as those of i
and i’ in Fig. 2.1. With due regard for these signs, we have


$$
\begin{aligned}
h &= s\sin\theta, & z_0 &= s\cos\theta, & s^2 &= z_0^2 + h^2, \\
h' &= s'\sin\theta', & z_0' &= s'\cos\theta', & s'^2 &= z_0'^2 + h'^2.
\end{aligned}
\qquad (5.1.4)
$$


Substituting the relations in the first line of Eq. (5.1.4) into [QB] gives


$$[QB] = -s\left[1 - \frac{2y\sin\theta}{s} + \frac{r^2}{s}\left(\frac{1}{s} - \frac{\cos\theta}{R}\right) + \frac{r^4}{4s^2}\left(\frac{1}{R^2} - \alpha s\cos\theta\right)\right]^{1/2}.$$


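The relation for [BQ’] is similar in form except that θ’ replaces θ, s’ replaces s,
and the leading minus sign is dropped. The expression for [QB] is now
transformed by applying the binomial expansion and retaining all terms through
fourth order, with the result


$$
\begin{aligned}
{}[QB] = -s\Big[\,1 &- \frac{y\sin\theta}{s} + \frac{y^2}{2s}\left(\frac{\cos^2\theta}{s} - \frac{\cos\theta}{R}\right) + \frac{x^2}{2s}\left(\frac{1}{s} - \frac{\cos\theta}{R}\right) \\
&+ \frac{x^2 y\sin\theta}{2s^2}\left(\frac{1}{s} - \frac{\cos\theta}{R}\right) + \frac{y^3\sin\theta}{2s^2}\left(\frac{\cos^2\theta}{s} - \frac{\cos\theta}{R}\right) \\
&+ \frac{r^4}{8s^2}\left(\frac{1}{R^2} - \alpha s\cos\theta\right) - \frac{r^4}{8s^2}\left(\frac{1}{s} - \frac{\cos\theta}{R}\right)^2\Big].
\end{aligned}
$$

A similar relation follows for [BQ’] once the changes noted in the preceding are
made. The substitution of these relations for [QB] and [BQ’] into Eq. (5.1.2) then
gives the OPL for this general ray, as follows:


$$
\begin{aligned}
\mathrm{OPL} ={}& (-ns + n's') - y\,(n'\sin\theta' - n\sin\theta) \\
&+ \frac{y^2}{2}\left[\frac{n'\cos^2\theta'}{s'} - \frac{n\cos^2\theta}{s} - \frac{n'\cos\theta' - n\cos\theta}{R}\right] \\
&+ \frac{x^2}{2}\left[\frac{n'}{s'} - \frac{n}{s} - \frac{n'\cos\theta' - n\cos\theta}{R}\right] \\
&- \frac{x^2 y}{2}\left[\frac{n\sin\theta}{s}\left(\frac{1}{s} - \frac{\cos\theta}{R}\right) - \frac{n'\sin\theta'}{s'}\left(\frac{1}{s'} - \frac{\cos\theta'}{R}\right)\right] \\
&- \frac{y^3}{2}\left[\frac{n\sin\theta}{s}\left(\frac{\cos^2\theta}{s} - \frac{\cos\theta}{R}\right) - \frac{n'\sin\theta'}{s'}\left(\frac{\cos^2\theta'}{s'} - \frac{\cos\theta'}{R}\right)\right] \\
&+ \frac{r^4}{8}\left[\frac{1}{R^2}\left(\frac{n'}{s'} - \frac{n}{s} - \frac{1+K}{R}\,(n'\cos\theta' - n\cos\theta)\right) + \frac{n}{s}\left(\frac{1}{s} - \frac{\cos\theta}{R}\right)^2\right. \\
&\qquad\quad \left. - \frac{n'}{s'}\left(\frac{1}{s'} - \frac{\cos\theta'}{R}\right)^2 - \frac{b}{n'-n}\,(n'\cos\theta' - n\cos\theta)\right].
\end{aligned}
\qquad (5.1.5)
$$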


Although Eq. (5.1.5) is a formidable equation in appearance, the application of 
Fermat’s Principle simplifies it considerably. We begin by noting that the first set 
of parentheses denotes the OPL for the chief ray. Because Fermat’s Principle is 
concerned with optical path differences and stationary values, as given in Eq. 
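(3.1.1), it is appropriate to remove this term by defining Φ as the OPD between
the general ray and the chief ray. Given this definition we have


$$
\begin{aligned}
\Phi &= \mathrm{OPL} - \mathrm{OPL}\ (\text{chief ray}) \\
&= A_0 y + A_1 y^2 + A_1' x^2 + A_2 y^3 + A_2' x^2 y + A_3 r^4,
\end{aligned}
\qquad (5.1.6)
$$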


where the A_i’s are the multiplying factors in Eq. (5.1.5). Applying Fermat’s
Principle in the form δ(OPL) = 0 to Eq. (5.1.6) gives


$$\frac{\partial}{\partial x}(\mathrm{OPL}) = \frac{\partial\Phi}{\partial x} = 0, \qquad \frac{\partial}{\partial y}(\mathrm{OPL}) = \frac{\partial\Phi}{\partial y} = 0. \qquad (5.1.7)$$

Equation (5.1.7) is satisfied for x = y = 0 only if A₀ = 0, hence n’ sin θ’ = n sin θ,
which is simply Snell’s law for the chief ray.

Before proceeding to a detailed analysis of the various terms in Eq. (5.1.5) we
look at each term in an approximate form. Consider first the terms in x² and y². In
the paraxial approximation each cosine is replaced by one and the terms are zero
by Eq. (2.2.5). In the next level of approximation each cosine is replaced by
1 − φ²/2, where φ is θ or θ’, and the terms are of the form x²φ² and y²φ². Thus
terms A₁y² and A₁’x² in Eq. (5.1.6) represent astigmatism, as noted in the
discussion accompanying Eqs. (4.4.1) and (4.4.2). For the next two terms,
those in x²y and y³, each cosine is replaced by one and each sine by its angle.
The corresponding terms in Eq. (5.1.6), A₂y³ and A₂’x²y, represent coma, with
A₂ = A₂’ in this level of approximation. The final term in Eq. (5.1.6), with all
cosines replaced by one, is spherical aberration.

Returning to Eq. (5.1.5) we see that one or the other of the terms proportional 
to the square of the distance from the surface vertex can be made zero by a proper 
choice of s’. The term in y² is zero if

$$\frac{n'\cos^2\theta'}{s_t'} - \frac{n\cos^2\theta}{s} = \frac{n'\cos\theta' - n\cos\theta}{R}, \qquad (5.1.8)$$


where s_t’ is the location of the tangential astigmatic image. As we will see, this
image is a line image oriented perpendicular to the plane defined by the z-axis
and the chief ray. Alternatively the term in x² is zero if

$$\frac{n'}{s_s'} - \frac{n}{s} = \frac{n'\cos\theta' - n\cos\theta}{R}, \qquad (5.1.9)$$

where s_s’ is the location of the sagittal astigmatic image. This image is also a line
but lying in the plane containing the z-axis and the chief ray. A sketch of these
images as defined by selected rays is shown in Fig. 5.2. Note that s_t’ = s_s’ in the
paraxial approximation where cos θ = cos θ’ = 1, and both Eqs. (5.1.8) and
(5.1.9) reduce to Eq. (2.2.5), as expected.

The separation between the two astigmatic images is found by solving Eqs.
(5.1.8) and (5.1.9) for 1/s_t’ and 1/s_s’, respectively, and taking the difference
between the two expressions. The result is


$$\frac{\Delta s'}{s_t' s_s'} = \frac{\tan^2\theta'}{n'R}\,(n'\cos\theta' - n\cos\theta) - \frac{n\sin^2\theta}{n' s\cos^2\theta'}\left(1 - \frac{n^2}{n'^2}\right),$$

where Δs’ = s_s’ − s_t’. To terms through θ² this expression reduces to


$$\frac{\Delta s'}{s'^2} = \frac{n^2\theta^2}{n'}\left(\frac{1}{n's'} - \frac{1}{ns}\right). \qquad (5.1.10)$$


In Eq. (5.1.10) s’ suffices to locate the image if Δs’ ≪ s_t’ or s_s’. Note that the
separation of the astigmatic images, and thus also the lengths of the line images,
is proportional to θ². Derivation of the image length follows in the next section.

At this point our analysis of Eq. (5.1.6) by means of Fermat’s Principle has
given Snell’s law and the locations of the astigmatic images. In the next section
we use these results to evaluate the remaining coefficients in Eq. (5.1.6), which, if
nonzero, determine the magnitude and type of aberration present in an image.

Fig. 5.2. Location and orientation of astigmatic line images. Tangential and sagittal images are
denoted by T and S, respectively.


5.2. EVALUATION OF ABERRATION COEFFICIENTS 


For an optical system that satisfies Eq. (5.1.7) for any (x,y) within the 
aperture, each of the coefficients in Eq. (5.1.6) must be zero and the system is 
perfect. If one or more of these coefficients is nonzero, then aberrations are 
present. Not surprisingly, the size of a given aberration is directly proportional to 
the corresponding coefficient in Eq. (5.1.6). 

Before we evaluate each coefficient it is important to note that this analysis is 
limited to finding third-order angular and transverse aberrations only. As evident 
from the discussion following Eq. (5.1.7), this means retaining only those terms 
for which the sum of powers of θ and r, x, or y is not greater than four. Thus A₃ is
independent of θ, A₂ and A₂’ are each proportional to θ, and A₁ and A₁’ are each
proportional to θ².
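Using Snell’s law and Eq. (2.2.5) to simplify the terms in Eq. (5.1.5) we get


$$A_3 = -\frac{1}{8}\left\{\left[\frac{K}{R^3}\,(n'-n) + b\right] + n^2\left(\frac{1}{R} - \frac{1}{s}\right)^2\left(\frac{1}{n's'} - \frac{1}{ns}\right)\right\}, \qquad (5.2.1)$$


$$A_2 = A_2' = -\frac{n^2\theta}{2}\left(\frac{1}{R} - \frac{1}{s}\right)\left(\frac{1}{n's'} - \frac{1}{ns}\right), \qquad (5.2.2)$$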


where the first term in brackets in Eq. (5.2.1) represents the contribution from the 
nonspherical part of the surface. Note that there is no term in Eq. (5.2.2) 
involving K or b and thus any nonspherical surface component does not 
contribute to the aberration associated with this coefficient, provided the aperture 
stop is at the surface as in Fig. 5.1. We will see later that this statement about Eq. 
(5.2.2) is not true when the aperture stop is displaced from the surface. 

The evaluation of the remaining coefficients A₁ and A₁’ depends on the image
distance chosen. For example, choosing s’ = s_s’ makes A₁’ = 0, and the coefficient
A₁ is evaluated by substituting s’ = s_s’ into A₁ in Eq. (5.1.5). The result to second
order in θ is


$$A_1 = -\frac{n^2\theta^2}{2}\left(\frac{1}{n's'} - \frac{1}{ns}\right). \qquad (5.2.3)$$

If, on the other hand, the choice were s’ = s_t’, then A₁ = 0 and

$$A_1' = \frac{n^2\theta^2}{2}\left(\frac{1}{n's'} - \frac{1}{ns}\right), \qquad (5.2.4)$$

hence A₁’ = −A₁. In either case terms involving K or b are absent but, as with A₂,
they will be present when the aperture stop does not coincide with the surface.

The difference in sign between A₁ and A₁’ is a measure of the differences
between the marginal rays at the ends of each of the line images. As seen in Fig. 
5.2, the marginal rays in the yz-plane intersect the chief ray before reaching the 
sagittal image, while the marginal rays in the xz-plane reach the chief ray after 
passing through the tangential image. In terms of transverse aberrations at the two 
images, the magnitudes are the same but the signs are opposite. 

Although the details are not given here, it is worth noting that choosing s’ as 
the midpoint between the line images leads to the result that A₁ = −A₁’, with each
one-half as large as the values given in Eqs. (5.2.3) and (5.2.4). A look at Fig. 5.2
midway between the sagittal and tangential images shows that this result is 
expected. A series of spot diagrams for an image with astigmatism as its only 
aberration is shown in Fig. 5.3. As expected, the image blur midway between the 
line images is circular in cross section and the images outside of the region 
between the line images are elliptical in cross section. 
Fig. 5.3. Spot diagrams for image with astigmatism. The T and S images are to the right and left,
respectively, of the astigmatic blur circle. Mirror has s = s’ = R = 1000 mm, y = 100 mm.


Because of the close relationship between A₁ and A₁’, it does not matter which
s’ is chosen to characterize the astigmatism present in an image. Our choice is
s’ = s_s’ or A₁’ = 0, hence Eq. (5.2.3) is the relation used in subsequent discussions.

The direct way in which A₁ is a measure of the astigmatism is seen in a
comparison of Eqs. (5.2.3) and (5.1.10), from which it follows that


$$\frac{\Delta s'}{s'^2} = -\frac{2A_1}{n'}. \qquad (5.2.5)$$


Using Eq. (5.2.5) it is a simple matter to derive an expression for the transverse 
astigmatism at the sagittal image. Defining the transverse astigmatism (abbre- 
viated TAS) as one-half the length of the line image, we find from the geometry 
of Fig. 5.2 that 


$$\mathrm{TAS} = -y\,\frac{\Delta s'}{s'} = \frac{2A_1 y s'}{n'}, \qquad (5.2.6)$$


where TAS < 0 at the sagittal image in Fig. 5.2 when y > 0, as required by the
sign convention. The diameter of the astigmatic blur circle midway between
the line images is |TAS|. Using the mirror parameters given in Fig. 5.3,
with θ = 0.5°, we get Δs’ = −152 µm and TAS = −15 µm for the images
shown.
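The quoted values can be reproduced with a few lines of code. The sketch below solves Eqs. (5.1.8) and (5.1.9) directly and compares the result with the third-order relations (5.2.3), (5.2.5), and (5.2.6). It assumes that the mirror of Fig. 5.3 has s = s’ = R = −1000 mm under the sign convention of this chapter (the negative signs are an assumption, chosen because they reproduce the signs of the quoted results), with n = 1 and n’ = −1 for reflection. The function names are illustrative.

```python
import math

def astigmatic_images(n, n_prime, s, R, theta):
    """Solve Eqs. (5.1.8) and (5.1.9) for the tangential and sagittal image distances."""
    theta_p = math.asin(n * math.sin(theta) / n_prime)  # Snell's law for the chief ray
    power = (n_prime * math.cos(theta_p) - n * math.cos(theta)) / R
    s_t = n_prime * math.cos(theta_p) ** 2 / (power + n * math.cos(theta) ** 2 / s)
    s_s = n_prime / (power + n / s)
    return s_t, s_s

def a1_third_order(n, n_prime, s, s_prime, theta):
    """Eq. (5.2.3): astigmatism coefficient for the choice s' = s_s'."""
    return -0.5 * n**2 * theta**2 * (1.0 / (n_prime * s_prime) - 1.0 / (n * s))

# Assumed parameters for the mirror of Fig. 5.3 (lengths in mm).
n, n_prime = 1.0, -1.0
R = s = s_prime = -1000.0
y = 100.0
theta = math.radians(0.5)

s_t, s_s = astigmatic_images(n, n_prime, s, R, theta)
delta_exact = s_s - s_t                          # definition following Eq. (5.1.9)

A1 = a1_third_order(n, n_prime, s, s_prime, theta)
delta_third = -2.0 * A1 * s_prime**2 / n_prime   # Eq. (5.2.5)
tas = 2.0 * A1 * y * s_prime / n_prime           # Eq. (5.2.6)

print(f"exact separation:       {1000 * delta_exact:.1f} um")
print(f"third-order separation: {1000 * delta_third:.1f} um")
print(f"TAS:                    {1000 * tas:.1f} um")
```

At this small field angle the exact and third-order separations agree to well under a micron, which is why the third-order relation is adequate here.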

The final coefficient to consider is A₀, which, from Eqs. (5.1.5) and (5.1.6), is


$$A_0 = -(n'\sin\theta' - n\sin\theta). \qquad (5.2.7)$$

Instead of setting A₀ = 0, which defines the exact path of the chief ray, we follow
the standard procedure of finding the angular difference in direction after
refraction between the actual and paraxial chief rays for plane object and
image surfaces. In the steps that follow, note that the usual statement for the
paraxial approximation n’θ’ = nθ is not sufficient. In order to preserve the exact
direction from which the chief ray approaches the center of the stop, and the
object and image surfaces as planes, it is necessary to express sines and tangents
of angles through third order.
From Eqs. (5.1.4) we get tan θ = h/z₀ and tan θ’ = h’/z₀’. To third order


$$\theta = \frac{h}{z_0} - \frac{1}{3}\,\theta^3, \qquad \theta' = \frac{h'}{z_0'} - \frac{1}{3}\,\theta'^3, \qquad (5.2.8)$$


where θ = tan⁻¹(h/z₀), θ’ = tan⁻¹(h’/z₀’). We now expand Eq. (5.2.7) to third
order, substitute Eqs. (5.2.8), and find


$$A_0 = \frac{nh}{z_0} - \frac{n'h'}{z_0'} - \frac{n\theta^3}{2}\left(1 - \frac{n^2}{n'^2}\right) = -\frac{n\theta^3}{2}\left(1 - \frac{n^2}{n'^2}\right), \qquad (5.2.9)$$


where the first two terms cancel by Eq. (2.2.7). Note that s and s’ in Eq. (2.2.7) are
z₀ and z₀’ in the notation of this chapter. Note also that A₀ = 0 for n’ = −n, a
reflecting surface.
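As a check on Eq. (5.2.9), the sketch below compares the third-order expression with Eq. (5.2.7) evaluated directly for the paraxial chief ray. It assumes that Eq. (2.2.7) amounts to n h/z₀ = n’h’/z₀’, that is, n tan θ = n’ tan θ’, as implied by the cancellation of the first two terms in Eq. (5.2.9). The indices and field angle are hypothetical.

```python
import math

def a0_third_order(n, n_prime, theta):
    """Eq. (5.2.9): A0 = -(n theta^3 / 2) (1 - n^2 / n'^2)."""
    return -0.5 * n * theta**3 * (1.0 - (n / n_prime) ** 2)

def a0_direct(n, n_prime, theta):
    """Eq. (5.2.7) for the paraxial chief ray.

    Assumes the paraxial image height obeys n tan(theta) = n' tan(theta').
    """
    theta_p = math.atan(n * math.tan(theta) / n_prime)
    return -(n_prime * math.sin(theta_p) - n * math.sin(theta))

# Hypothetical air-to-glass surface at a field angle of 0.05 rad.
theta = 0.05
approx_glass = a0_third_order(1.0, 1.5, theta)
direct_glass = a0_direct(1.0, 1.5, theta)

# Reflecting surface, n' = -n: both forms should vanish.
approx_mirror = a0_third_order(1.0, -1.0, theta)
direct_mirror = a0_direct(1.0, -1.0, theta)

print(f"glass:  third order {approx_glass:.4e}, direct {direct_glass:.4e}")
print(f"mirror: third order {approx_mirror:.1e}, direct {direct_mirror:.1e}")
```

The two values for the refracting case differ only by terms of fifth order in θ, and both vanish for the mirror.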

At this point let us summarize our findings. We have relations for the 
aberration coefficients and, in the case of A₁, have its relationship to a transverse
aberration. The next step is to find the connection between the remaining 
coefficients and their respective transverse aberrations. This is done by first 
establishing the connection between nonzero terms in Eq. (5.1.6) and deviations 
of the wavefront converging on the image point from the spherical shape 
produced by a perfect system. 





5.3. RAY AND WAVEFRONT ABERRATIONS 


An optical system free of aberrations takes light from an object point Q, for 
which the wavefront is a sphere with center at Q, and images it at the Gaussian 
image point Q’. The wavefront of the light converging toward Q’ is a sphere 
whose center is at Q’, and the OPL along any ray through the system is constant. 
Thus Φ in Eq. (5.1.6) is zero. This spherical wavefront is taken as our reference
and designated Σ_r.

Fig. 5.4. Cross sections of reference and aberrant wavefronts, Σ_r and Σ_a, respectively. Radius of
curvature of the reference wavefront is s’.


For a system with aberrations the wavefront converging toward Q’ is no longer
spherical and, depending on the sign of Φ, is either advanced or retarded at each
point on the wavefront. A schematic cross section of an aberrant wavefront,
designated Σ_a, is shown in Fig. 5.4, with it and the ideal wavefront Σ_r in contact
at their centers where Φ = 0. At any other point on the actual wavefront, Φ is the
OPD between Σ_r and Σ_a. The geometrical distance along any ray between the
wavefronts is Φ/n’ and designating this distance as Δ we have


$$\Delta = \frac{\Phi(x, y)}{n'} = \Sigma_r(x, y) - \Sigma_a(x, y). \qquad (5.3.1)$$


For the situation shown in Fig. 5.4 we have n’ > 0, hence Δ and Φ have the same
signs. When Δ > 0 the actual wavefront is retarded with respect to the reference
wavefront; the actual wavefront is advanced when Δ < 0.
Differentiating Eq. (5.3.1) gives

$$\frac{\partial\Delta}{\partial y} = \frac{1}{n'}\frac{\partial\Phi}{\partial y} = \left(\frac{\partial\Sigma_r}{\partial y} - \frac{\partial\Sigma_a}{\partial y}\right), \qquad (5.3.2)$$

with a similar relation in which x replaces y. The quantity in parentheses in Eq.
(5.3.2) is the difference in slopes between the reference and aberrant wavefronts
in a slice parallel to the yz-plane. Because rays are perpendicular to wavefronts,
this is also the difference between the slopes of the ray for a perfect system and
the actual ray, each at point (x, y) on the respective wavefronts. Given this
difference in slopes, there is a consequent transverse aberration in the y-direction
at the image, as shown in Fig. 5.5. A similar result follows in the x-direction from
Eq. (5.3.2) with x in place of y. From the geometry in Fig. 5.5 we get

$$\mathrm{TA}_y = s'\,\frac{\partial\Delta}{\partial y} = \frac{s'}{n'}\frac{\partial\Phi}{\partial y}, \qquad \mathrm{TA}_x = s'\,\frac{\partial\Delta}{\partial x} = \frac{s'}{n'}\frac{\partial\Phi}{\partial x}, \qquad (5.3.3)$$

where the subscripts x and y on TA denote transverse aberrations in the x- and
y-direction, respectively.

Fig. 5.5. Geometry of wavefront slope difference and transverse aberration in the image. See Eq.
(5.3.3).

Substituting Eq. (5.1.6) into Eqs. (5.3.3) gives


$$\mathrm{TA}_y = \frac{s'}{n'}\left[A_0 + 2A_1 y + A_2\,(x^2 + 3y^2) + 4A_3\,y r^2\right], \qquad (5.3.4)$$

$$\mathrm{TA}_x = \frac{s'}{n'}\left[2A_1' x + 2A_2\,xy + 4A_3\,x r^2\right]. \qquad (5.3.5)$$


We have now established the connection between the aberration coefficients and
the geometrical transverse aberrations. The corresponding angular aberrations,
denoted by AA, are given by


$$\mathrm{TA}_x = s'\,\mathrm{AA}_x, \qquad \mathrm{TA}_y = s'\,\mathrm{AA}_y. \qquad (5.3.6)$$


We now examine briefly the specific aberrations in Eq. (5.3.4) and relate each to 
the corresponding wavefront aberration. 
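The link between Eq. (5.3.3) and Eqs. (5.3.4) and (5.3.5) can be verified numerically. The sketch below evaluates the polynomial of Eq. (5.1.6), differentiates it by central differences as in Eq. (5.3.3), and compares the result with the analytic gradient. The coefficient values, image distance, and aperture point are hypothetical and serve only to exercise the formulas; A₂ and A₂’ are kept as separate inputs so the general form is visible.

```python
def wavefront_opd(x, y, A):
    """Eq. (5.1.6), with A = (A0, A1, A1p, A2, A2p, A3)."""
    A0, A1, A1p, A2, A2p, A3 = A
    r2 = x * x + y * y
    return A0 * y + A1 * y**2 + A1p * x**2 + A2 * y**3 + A2p * x**2 * y + A3 * r2**2

def transverse_aberration(x, y, A, s_prime, n_prime):
    """Eqs. (5.3.4) and (5.3.5): (s'/n') times the analytic gradient of Phi."""
    A0, A1, A1p, A2, A2p, A3 = A
    r2 = x * x + y * y
    scale = s_prime / n_prime
    ta_y = scale * (A0 + 2 * A1 * y + A2p * x**2 + 3 * A2 * y**2 + 4 * A3 * y * r2)
    ta_x = scale * (2 * A1p * x + 2 * A2p * x * y + 4 * A3 * x * r2)
    return ta_x, ta_y

def transverse_aberration_numeric(x, y, A, s_prime, n_prime, step=1e-3):
    """Eq. (5.3.3), with central differences in place of the partial derivatives."""
    scale = s_prime / n_prime
    d_dx = (wavefront_opd(x + step, y, A) - wavefront_opd(x - step, y, A)) / (2 * step)
    d_dy = (wavefront_opd(x, y + step, A) - wavefront_opd(x, y - step, A)) / (2 * step)
    return scale * d_dx, scale * d_dy

# Hypothetical coefficients (lengths in mm) and an arbitrary point in the aperture.
A = (1e-6, 2e-7, -1e-7, 3e-9, 3e-9, 5e-12)
s_prime, n_prime = 1000.0, 1.0
analytic = transverse_aberration(30.0, 40.0, A, s_prime, n_prime)
numeric = transverse_aberration_numeric(30.0, 40.0, A, s_prime, n_prime)

print("analytic TA_x, TA_y:", analytic)
print("numeric  TA_x, TA_y:", numeric)
```

Setting all coefficients but one to zero in `A` isolates a single aberration, which is the approach taken in the discussion that follows.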

The term with A₃ is a measure of spherical aberration. As noted in the
discussion following Eq. (4.3.6), we replace r with y for third-order aberrations.
The resulting TSA3 changes sign when y changes sign, as shown clearly in Fig.
4.5. The corresponding wavefront aberration map is shown in Fig. 5.6, where the
concave surface represents the advance of the spherically aberrant wavefront
relative to the reference wavefront. The z-axis of the optical system in Fig. 5.6 is
directed vertically upward and passes through the center of the diagram, origin O
in Fig. 5.5 is at the center of the surface in Fig. 5.6, and the x-axis is to the right.
Note that an unaberrated wavefront is a plane in this type of map, with the
deviation from this plane proportional to Δ of Eq. (5.3.1).

Fig. 5.6. Wavefront aberration map for image with spherical aberration. The image is located at
the paraxial focus. See Figs. 4.5 and 4.6 for ray and spot diagrams.

The term in Eq. (5.3.4) with A₁ represents astigmatism, as discussed in the
previous section, and the wavefront aberration map for the circular astigmatic blur
in Fig. 5.3 is shown in Fig. 5.7. Note that this map shows a wavefront that is both
advanced and retarded. The portion that is advanced is higher than the center,
while the retarded part is lower. A useful exercise for the reader is to correlate the
ray directions in Fig. 5.2 with the shape of the wavefront map in Fig. 5.7.

Fig. 5.7. Wavefront aberration map for the circular astigmatic image in Fig. 5.3.

The coefficient A₂ is a measure of coma. A sketch showing the asymmetric
form of this aberration is given in Fig. 5.8. Note that the marginal rays on the
y-axis meet at a point three times farther from the Gaussian image, compared with
the corresponding point for the marginal rays on the x-axis. The source of this
difference between the tangential and sagittal coma is evident by inspection of
Eq. (5.3.4). Coma is fully specified by giving either transverse sagittal coma
(TSC) or transverse tangential coma (TTC), with TTC = 3 TSC.

Fig. 5.8. Sketch of comatic image profile. Here TC and SC are tangential and sagittal coma,
respectively. TC = 3·SC.
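The factor of three can be seen by tracing a ring of rays from one aperture zone through the coma terms alone. The sketch below keeps only the A₂ terms of Eqs. (5.3.4) and (5.3.5), with A₂’ = A₂, and uses TSC = A₂y²s’/n’ as the unit of length. The coefficient, image distance, and zone radius are hypothetical.

```python
import math

def coma_ray(phi, y0, A2, s_prime, n_prime):
    """Image-plane position (TA_x, TA_y) of the ray from aperture point
    (y0 sin(phi), y0 cos(phi)), keeping only the A2 terms with A2' = A2."""
    x, y = y0 * math.sin(phi), y0 * math.cos(phi)
    scale = s_prime / n_prime
    return scale * 2 * A2 * x * y, scale * A2 * (x * x + 3 * y * y)

# Hypothetical values (lengths in mm).
A2, s_prime, n_prime, y0 = 2.0e-9, 1000.0, 1.0, 100.0
tsc = A2 * y0**2 * s_prime / n_prime

_, tangential = coma_ray(0.0, y0, A2, s_prime, n_prime)          # ray from the y-axis
_, sagittal = coma_ray(math.pi / 2, y0, A2, s_prime, n_prime)    # ray from the x-axis

# Rays from the whole zone land on a circle of radius TSC centered at 2 TSC.
ring = [coma_ray(2 * math.pi * k / 36, y0, A2, s_prime, n_prime) for k in range(36)]
radii = [math.hypot(tx, ty - 2 * tsc) for tx, ty in ring]

print(f"TSC = {tsc:.4f} mm, tangential = {tangential:.4f} mm, sagittal = {sagittal:.4f} mm")
print(f"ring radius: min {min(radii):.4f} mm, max {max(radii):.4f} mm")
```

Each aperture zone therefore maps to a circle in the image, and the stack of these circles for all zones builds up the flare shape sketched in Fig. 5.8.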

The distribution of light rays over the comatic image is not uniform, there 
being a greater density of rays near the point of the comatic image. A spot 
diagram for a comatic image is shown in Fig. 5.9 with the chief ray at the point of 
the image. About 80% of the energy is within a distance equal to TSC from the 
Gaussian focus. Unlike the case of spherical aberration, a shift along the chief ray 
does not improve the image quality. The wavefront map for a comatic image is 
shown in Fig. 5.10. (For clarity the map in Fig. 5.10 is rotated by 90° relative to 
the spot diagram in Fig. 5.9, with the y-axis to the right.) In this case the 
asymmetry is clearly shown by the general downward slope from left to right in 
Fig. 5.10. A careful study of ray directions in Fig. 5.8 and slopes in Fig. 5.10 
(suitably rotated) is suggested. 


Fig. 5.9. Through-focus spot diagrams of image with coma.


The remaining coefficient A₀ is a measure of distortion. If A₀ is nonzero the
effect is to displace the chief ray and change the position of the final image but
not its quality. Any straight line in the object plane that does not pass through the
z-axis is imaged as a curved line if distortion is present. Thus a square in the
object plane centered on the z-axis will appear distorted in the image plane. If
A₀ > 0 the reimaged square resembles a pincushion; if A₀ < 0 the result is barrel
distortion. See Born and Wolf (1980), or Welford (1986), for schematic diagrams
of distortion.

Fig. 5.10. Wavefront aberration map for the comatic image in Fig. 5.9. The map is rotated by 90°
relative to the spot diagram in Fig. 5.9.

For a mirror the paraxial equation is exact, and hence there is no distortion 
when the stop is at the surface. Distortion generally is present when the stop is 
displaced from the surface of a mirror. 

Finally, we have not yet considered curvature of field, the final aberration 
noted in Section 4.4. Characteristics of this aberration are given in Section 5.7 of 
this chapter. 

For more details on the nature of aberrations and wavefront maps, the reader 
should consult Born and Wolf (1980), and Welford (1986). An especially 
thorough discussion of aberrations is given by Mahajan (1998). 


5.4. SUMMARY OF ABERRATION RESULTS, STOP AT SURFACE 


It is now appropriate to bring together all of the important results on aberrations 
and present them in a set of tables for convenient reference. Following are two 
tables of aberration coefficients, Table 5.1 for a general refracting surface and Table 
5.2 for a reflecting surface. Because many of the applications considered in 
subsequent chapters involve single or multiple mirror systems, it is convenient 
to include a separate table for mirrors. The next two tables summarize the results 
for transverse aberrations, Table 5.3 for a general refracting surface and Table 5.4 
for mirrors. Explanations and definitions are included as needed. 

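All of the results in this section apply specifically to the case where the
aperture stop is at the surface. Because there are no optical surfaces either
preceding or following this surface, the entrance and exit pupils are also located
at the surface.


Table 5.1

Aberration Coefficients for General Surface (notes a, b)

$$
\begin{aligned}
A_3 &= -\frac{1}{8}\left[n^2\left(\frac{1}{R} - \frac{1}{s}\right)^2\left(\frac{1}{n's'} - \frac{1}{ns}\right) + \frac{K}{R^3}\,(n'-n) + b\right] \\
A_2 &= -\frac{n^2\theta}{2}\left(\frac{1}{R} - \frac{1}{s}\right)\left(\frac{1}{n's'} - \frac{1}{ns}\right) \\
A_1 &= -\frac{n^2\theta^2}{2}\left(\frac{1}{n's'} - \frac{1}{ns}\right), \qquad A_0 = -\frac{n\theta^3}{2}\left(1 - \frac{n^2}{n'^2}\right)
\end{aligned}
$$

a Entrance pupil is at surface.
b For a spherical refracting surface with no aspheric component, the last two terms in A₃ are absent.


Table 5.2

Aberration Coefficients for Mirror Surface (notes a, b)

$$
\begin{aligned}
A_3 &= \frac{n}{4R^3}\left[K + \left(\frac{m+1}{m-1}\right)^2\right] - \frac{b}{8}, \qquad A_0 = 0 \\
A_2 &= -\frac{n\theta}{R^2}\left(\frac{m+1}{m-1}\right), \qquad A_1 = \frac{n\theta^2}{R}
\end{aligned}
$$

a Entrance pupil is at surface.
b The following relations apply to a mirror:


Table 5.3

Transverse Aberrations for General Surface (notes a, b)

$$
\begin{aligned}
\mathrm{TSA} &= -\frac{1}{2}\left[n^2\left(\frac{1}{R} - \frac{1}{s}\right)^2\left(\frac{1}{n's'} - \frac{1}{ns}\right) + \frac{K}{R^3}\,(n'-n) + b\right]\frac{y^3 s'}{n'} \\
\mathrm{TSC} &= -\frac{n^2}{2}\left(\frac{1}{R} - \frac{1}{s}\right)\left(\frac{1}{n's'} - \frac{1}{ns}\right)\frac{y^2\theta s'}{n'} = \frac{1}{3}\,\mathrm{TTC} \\
\mathrm{TAS} &= -n^2\left(\frac{1}{n's'} - \frac{1}{ns}\right)\frac{y\theta^2 s'}{n'} = -y\,\frac{\Delta s'}{s'}, \qquad \mathrm{TDI} = -\frac{n\theta^3}{2}\left(1 - \frac{n^2}{n'^2}\right)\frac{s'}{n'}
\end{aligned}
$$

a Entrance pupil is at surface.
b Angular aberrations are given by the above relations with the final s’/n’ divided out.


Table 5.4

Transverse Aberrations for Mirror Surface (note a)

$$
\begin{aligned}
\mathrm{TSA} &= -\frac{y^3}{R^3}\left[K + \left(\frac{m+1}{m-1}\right)^2\right]s' + \frac{b\,y^3 s'}{2n} \\
\mathrm{TSC} &= \frac{y^2}{R^2}\left(\frac{m+1}{m-1}\right)\theta s' = \frac{1}{3}\,\mathrm{TTC} \\
\mathrm{TAS} &= -\frac{2y}{R}\,\theta^2 s', \qquad \mathrm{TDI} = 0
\end{aligned}
$$

a Entrance pupil is at surface.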
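The mirror relations lend themselves to a short numerical sketch. The code below takes the Table 5.4 entries in the form TSA = −(y³/R³)[K + ((m+1)/(m−1))²]s’ + by³s’/2n, TSC = (y²/R²)((m+1)/(m−1))θs’, and TAS = −(2y/R)θ²s’, and assumes the magnification is defined as m = −s’/s for a mirror. The two test cases are a sphere used at its center of curvature, assumed to correspond to the mirror of Fig. 5.3 with R = −1000 mm, and a hypothetical paraboloid with the object at infinity.

```python
import math

def mirror_transverse_aberrations(K, m, R, s_prime, y, theta, b=0.0, n=1.0):
    """Third-order transverse aberrations of a mirror with the stop at the surface.

    Returns (TSA, TSC, TAS); m is assumed to be the magnification -s'/s.
    """
    g = (m + 1.0) / (m - 1.0)
    tsa = -(y**3 / R**3) * (K + g**2) * s_prime + b * y**3 * s_prime / (2.0 * n)
    tsc = (y**2 / R**2) * g * theta * s_prime
    tas = -(2.0 * y / R) * theta**2 * s_prime
    return tsa, tsc, tas

theta = math.radians(0.5)

# Sphere at unit magnification (s = s' = R, so m = -1): astigmatism only.
sphere = mirror_transverse_aberrations(K=0.0, m=-1.0, R=-1000.0,
                                       s_prime=-1000.0, y=100.0, theta=theta)

# Hypothetical paraboloid, object at infinity (m = 0, s' = R/2): coma and astigmatism.
parab = mirror_transverse_aberrations(K=-1.0, m=0.0, R=-2000.0,
                                      s_prime=-1000.0, y=100.0, theta=theta)

for name, (tsa, tsc, tas) in (("sphere", sphere), ("paraboloid", parab)):
    print(f"{name}: TSA = {1000 * tsa:.2f} um, TSC = {1000 * tsc:.2f} um, "
          f"TAS = {1000 * tas:.2f} um")
```

The first case reproduces an image with astigmatism as its only aberration, and the second shows the paraboloid free of spherical aberration but not of coma.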


5.4.a. DEFINITIONS, CHARACTER OF ABERRATIONS


Measures of the transverse aberrations are taken from Eq. (5.3.4). For 
completeness a summary of the designations follows. Each aberration is desig- 
nated by two letters: spherical aberration, SA; sagittal coma, SC; tangential coma, 
TC; astigmatism, AS; and distortion, DI. If the aberration is transverse, a prefix T 
is attached; if the aberration is angular, a prefix A is attached. 

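With these designations the transverse aberration expressions are:


$$\mathrm{TSA} = \frac{4A_3 y^3 s'}{n'}, \qquad \mathrm{TSC} = \frac{A_2 y^2 s'}{n'}, \qquad \mathrm{TAS} = \frac{2A_1 y s'}{n'}, \qquad \mathrm{TDI} = \frac{A_0 s'}{n'}. \qquad (5.4.1)$$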
Note that TSC is simply one-third of the tangential coma, where the latter is based 
on the rays from the y-axis. All of the aberrations are computed using rays from 
the y-axis, with the full aberrations given by Eqs. (5.4.1) when the radius of the 
surface is substituted for y. 

It is not necessary to use Eq. (5.3.5) to find transverse aberrations in the x
direction because $A_1'$ is zero, given our choice $s' = s_s'$, and the extent of the blur in
the x direction is known from results of Eqs. (5.4.1). Thus all measures of
transverse aberrations given in what follows are in the y direction, that is,
measured in the plane defined by the chief ray and z-axis.

Results obtained by substituting from Tables 5.1 and 5.2 into Eqs. (5.4.1) are
given in Tables 5.3 and 5.4. Although $\Delta s'$ is actually a longitudinal aberration, its
relation to TAS is included in Table 5.3 for completeness. All of the relations in
Tables 5.1 and 5.2 include the sign convention and thus these equations give
information about the character of the aberrations as well as their magnitudes. A
brief summary of the relation between the sign of the aberration coefficient and
image character follows, where the choice of sign is in accord with the figures
illustrating each of the aberrations. With $\theta > 0$,


$A_3 < 0$: marginal rays cross chief ray between surface and Gaussian focus,

$A_2 > 0$: coma flare is directed away from z-axis, or Gaussian focus between
flare and axis,

$A_1 < 0$: tangential line image closer to surface than sagittal image, and

$A_0 > 0$: pincushion distortion.


For some purposes the sign of the aberration is of no consequence and the 
magnitude is all that matters. In terms of magnitudes, each of the aberrations has 
the following interpretation: 
|TAS| = half-length of astigmatic line image
= diameter of astigmatic blur circle,
3|TSC| = length of comatic flare
= 1.5 x width of comatic flare,
|TSA3| = radius of blur at paraxial focus
= 2 x diameter of circle of least confusion.


All of these results assume, of course, that only a single aberration is nonzero. If 
more than one aberration is present in an image, there is no simple way to 
characterize the blur character or dimensions. 

Inspection of the entries in Tables 5.1 and 5.2 shows that the aberration 
coefficients are independent of the direction of incident light on a given surface. 
Reversing the direction of incident light is equivalent to taking Fig. 5.1 and
reversing it left for right, thus changing the signs of $n$, $n'$, $s$, $s'$, $R$, and $\theta$. Thus the
transverse aberrations are also the same for either direction of incident light. This 
result is expected because the direction of the incident light, left to right or vice 
versa, cannot change the character of the image. 


5.4.b. APLANATIC CONDITION AND OTHER EXAMPLES


We now examine the various terms in Tables 5.1 and 5.2 to find examples of 
surfaces that have specific aberration characteristics. For a spherical surface with 
b = 0 the aberration coefficients $A_1$ through $A_3$ in Table 5.1 are zero when
$n's' = ns$. For a mirror $n' = -n$ and this condition is satisfied with $s' = -s$. Using
the paraxial mirror equation (2.3.2) we find $R = \infty$, hence the surface is a plane
mirror. This result is expected but not especially useful because a plane mirror has 
zero power. 

For a spherical refracting surface the condition n's’ = ns, together with the 
Gaussian equation (2.2.2) gives 


ns =n's' = R(n+n’). (5.4.2) 


This defines the object and image positions for a so-called aplanatic sphere, 
where the term aplanatic means the system has zero spherical aberration and 
coma. A lens of this type is often used as the first element in high-power 
microscope objectives. It has also been used as an element near the focus of a 
Schmidt camera in a spectrograph to shorten the camera focal length, as noted by 
Bowen (1960). In this application its chromatic aberration is not a serious 
constraint in getting good image quality. 
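
The aplanatic conjugates of Eq. (5.4.2) are easy to evaluate numerically. The sketch below assumes the Gaussian equation (2.2.2) has the form n'/s' - n/s = (n' - n)/R, which is the form consistent with Eq. (5.4.2); the function names are illustrative only.

```python
def aplanatic_conjugates(n, n_prime, R):
    """Return (s, s') satisfying n*s = n'*s' = R*(n + n'), Eq. (5.4.2)."""
    c = R * (n + n_prime)
    return c / n, c / n_prime


def gaussian_residual(n, n_prime, R, s, s_prime):
    """Residual of the assumed Gaussian equation n'/s' - n/s = (n' - n)/R."""
    return n_prime / s_prime - n / s - (n_prime - n) / R
```

For example, a surface of radius R = 10 separating n = 1 from n' = 1.5 gives s = 25 and s' = 16.67 in the same units, and the residual of the Gaussian equation vanishes for this pair.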

A paraboloid (K = —1) in collimated light (m = 0) has zero spherical 
aberration but nonzero coma and astigmatism. Thus this type of mirror, though 
perfect on-axis, has a limited field of view (FOV) when used as a telescope. 
Further discussion of the image characteristics of a paraboloid is presented in
Chapter 6. 

For a conic mirror with b = 0, the condition for zero spherical aberration fixes 
K in terms of m. Setting $A_3 = 0$ in Table 5.2 gives the relation in Eq. (3.5.4), a
result used in Section 4.5 to establish the conic constant for the secondary mirror 
of a classical Cassegrain telescope. Note that such a mirror has coma and 
astigmatism. 

From the entries in Table 5.2 we also see that a sphere (K =0) in a 
configuration with m = —1, thus s’ = s = R, has zero spherical aberration and 
coma, hence is an aplanat. 

As a final example we point out that a sphere in collimated light has nonzero 
aberrations both on- and off-axis. Although this would appear to limit the 
usefulness of a spherical mirror, our discussion in Chapter 4 shows that this is 
not the case when the aperture stop is separated from the mirror. This is our topic 
for the next section. 
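
The mirror examples above can be checked with a few lines of Python. This is a sketch under stated assumptions: the coefficient formulas are the mirror entries as read from Table 5.2, A_1 = n theta^2/R, A_2 = -(n theta/R^2)(m+1)/(m-1), and A_3 = (n/4R^3)[K + ((m+1)/(m-1))^2] - b/8, and the function name is hypothetical.

```python
def mirror_coefficients(n, theta, R, m, K=0.0, b=0.0):
    """Return (A1, A2, A3) for a mirror with the entrance pupil at the surface.

    m is the magnification; m = 1 is excluded (division by zero).
    """
    q = (m + 1.0) / (m - 1.0)
    a1 = n * theta**2 / R
    a2 = -n * theta * q / R**2
    a3 = n * (K + q**2) / (4.0 * R**3) - b / 8.0
    return a1, a2, a3
```

With K = -1 and m = 0 the spherical term vanishes while A_2 and A_1 do not, and with K = 0 and m = -1 both A_3 and A_2 vanish, in agreement with the paraboloid and aplanatic-sphere examples.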


5.5. ABERRATIONS FOR DISPLACED STOP 


We now determine the aberration coefficients for a single surface with the 
aperture stop displaced from the surface, as shown in Figs. 5.11 and 5.12. In Fig. 
5.11 the stop defines the light bundle before refraction at the surface, and the 
entrance pupil coincides with the stop. In Fig. 5.12 the stop follows the surface 
and the entrance pupil, or image of the stop, is separate from the stop. In both 
figures the chief ray is directed toward the center of the pupil at angle $\psi$ with the
z-axis and intersects the surface at height L.


Fig. 5.11. Portion of Fig. 5.1 with aperture stop displaced from surface. The chief ray makes angle
$\psi$ with the z-axis and intersects the surface at height L. The entrance pupil is at the stop. The relation
between parameters is given in Eq. (5.5.2). In this diagram W < 0, L > 0, and $\psi$ and $\theta$ > 0.

Fig. 5.12. Repeat of Fig. 5.11 with stop to right of surface. The stop is reimaged as the entrance
pupil EP. Equation (5.5.2) applies when W' replaces W. In this diagram L < 0, and W and W' > 0.


Comparing these figures with Fig. 5.1 we see that an arbitrary ray that met the 
surface at (x, y) in Fig. 5.1 now meets the surface at (x, y + L). Because a different 
portion of the surface refracts the light bundle from Q when the stop is displaced, 
it is expected that the image aberrations will differ from those derived for Fig. 5.1. 
An example of this difference was previously noted in Section 4.5, where the 
absence of coma and astigmatism for a sphere with stop at the center of curvature 
of the sphere was the basis for the Schmidt camera. These aberrations are not zero 
when the stop coincides with the surface, as is evident from Table 5.2. This one 
example illustrates the importance of the stop position in controlling or eliminat- 
ing aberrations. 


5.5.a. STOP-SHIFT RELATIONS 


We now proceed to find the general aberration relations for a displaced stop.
The procedure is simply one of putting $y + L$ in place of $y$ in Eq. (5.1.6).
Collecting terms in various powers of x, y, or r, and dropping all constants
independent of these variables, we get

$$\begin{aligned}
\Phi &= y\,(A_0 + 2LA_1 + 3L^2A_2 + 4L^3A_3) \\
&\quad + y^2\,(A_1 + 3LA_2 + 6L^2A_3) + x^2\,(A_1' + LA_2' + 2L^2A_3) \\
&\quad + y^3\,(A_2 + 4LA_3) + x^2y\,(A_2' + 4LA_3) + A_3 r^4 \\
&= B_0 y + B_1 y^2 + B_1' x^2 + B_2\,(y^3 + x^2 y) + B_3 r^4,
\end{aligned} \tag{5.5.1}$$

where $A_2' = A_2$ is used to combine the cubic terms.


Even before deriving the explicit form of the $B_j$, there are several important
statements that follow from Eq. (5.5.1). These stop-shift relations are:

(1) L does not appear in the $r^4$-term and thus the spherical aberration
coefficient is independent of stop position.

(2) If $A_3 = 0$, the coma coefficient $B_2$ is independent of the stop position and
the coma is that given in Section 5.4.

(3) If both $A_3$ and $A_2$ are zero, $B_1$ and $B_1'$ are independent of the stop position,
and reduce to $A_1$ and $A_1'$, respectively.


The importance of these statements will become evident when specific systems 
are discussed in the chapters that follow. Although they are deduced here for a 
single surface, it turns out these statements also apply to a system made up of 
many surfaces. 
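
The algebra behind Eq. (5.5.1) can be verified numerically. The sketch below assumes the stop-at-surface polynomial of Eq. (5.1.6) has the form A_0 y + A_1 y^2 + A_1' x^2 + A_2 (y^3 + x^2 y) + A_3 r^4, with A_2' = A_2; all function names are illustrative. It replaces y by y + L, drops the constant term, and compares the result with the polynomial built from the B coefficients.

```python
import random


def phi(x, y, c0, c1, c1p, c2, c3):
    """Aberration polynomial in pupil coordinates (x, y)."""
    r2 = x * x + y * y
    return c0 * y + c1 * y * y + c1p * x * x + c2 * (y**3 + x * x * y) + c3 * r2 * r2


def stop_shift(L, a0, a1, a1p, a2, a3):
    """Coefficients B0, B1, B1', B2, B3 of Eq. (5.5.1) for stop shift L."""
    b0 = a0 + 2 * L * a1 + 3 * L**2 * a2 + 4 * L**3 * a3
    b1 = a1 + 3 * L * a2 + 6 * L**2 * a3
    b1p = a1p + L * a2 + 2 * L**2 * a3
    b2 = a2 + 4 * L * a3
    return b0, b1, b1p, b2, a3


def max_mismatch(trials=200, seed=1):
    """Largest difference between the shifted polynomial and its B form."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a = [rng.uniform(-1.0, 1.0) for _ in range(5)]
        L, x, y = (rng.uniform(-1.0, 1.0) for _ in range(3))
        shifted = phi(x, y + L, *a) - phi(0.0, L, *a)  # drop the constant
        worst = max(worst, abs(shifted - phi(x, y, *stop_shift(L, *a))))
    return worst
```

The mismatch stays at the level of floating-point rounding, and `stop_shift` makes the three stop-shift statements explicit: B_3 never contains L, B_2 loses its L dependence when A_3 = 0, and B_1 and B_1' lose theirs when A_3 and A_2 are both zero.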


5.5.b. ABERRATION COEFFICIENTS 


The process required to evaluate each $B_j$ is a straightforward one, though the
algebra is a bit messy at times and therefore omitted in the discussion to follow. 
When the evaluation is more than simple substitution, as for astigmatism, a brief 
outline of the procedure is given. 

To begin, we note the following relations, valid in the paraxial approximation, 
derived from the geometry in Fig. 5.11: 


$$L = -W\psi, \qquad \theta = \psi\,[1 - (W/s)], \tag{5.5.2}$$


where W is the distance from the surface to the entrance pupil, and both L and W
are governed by the same sign convention as other distances. Figure 5.12 shows
the geometry for the case where the stop follows the surface. In this case Eq.
(5.5.2) applies if W is replaced by W'. A glance at Eq. (5.5.2) and Fig. 5.11 shows
that, for a given $\theta$, $\psi$ increases in size as s approaches W. There will come a point
where $\psi$ is large enough to make the paraxial result in Eq. (5.5.2) invalid, and
results derived using Eq. (5.5.2) will be incorrect. Unfortunately, no simple 
statement can be made about where this breakdown occurs, and one must check 
on a case-by-case basis as to the validity of results derived from third-order 
aberration theory. Exact ray-tracing is generally used to check the results. 

A final point to be made is in the choice of parameters used in presenting the 
aberration coefficients. The choice made here is to eliminate L and $\theta$, and to give
the results in terms of W, the entrance pupil position, and $\psi$, the chief ray angle.

With these preliminaries behind us, we proceed with the results; Table 5.5 
gives the coefficients for a general surface, while Table 5.6 gives the results for a 
mirror. The spherical term $B_3$ is taken from Tables 5.1 and 5.2 and is included
here for completeness. The term $B_2$ is derived from Eq. (5.5.1) and the entries in
Tables 5.1 and 5.2 by direct substitution, while $B_0$ follows from Eq. (5.5.1) and a
procedure similar to that used in arriving at Eq. (5.2.9).
Table 5.5


General Aberration Coefficients, Centered Pupil^{a}

$$B_3 = -\frac{1}{8}\left[n^2\left(\frac{1}{R}-\frac{1}{s}\right)^2\left(\frac{1}{n's'}-\frac{1}{ns}\right) + \frac{K}{R^3}(n'-n) + b\right]$$

a Entrance pupil at distance W from surface.











The derivation of $B_1$ is not one of direct substitution of the $A_i$ but involves
going through the steps analogous to those used in Sections 5.1 and 5.2. First we
note that

$$B_1 = A_1 + 3LQ, \qquad B_1' = A_1' + LQ, \tag{5.5.3}$$

where $Q = A_2 + 2LA_3$, and $A_1$ and $A_1'$ are the multiplying factors of $y^2$ and $x^2$,
respectively, in Eq. (5.1.5).


Table 5.6 


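Mirror Aberration Coefficients, Centered Pupil^{a}

$$B_3 = \frac{n}{4R^3}\left[K + \left(\frac{m+1}{m-1}\right)^2\right] - \frac{b}{8}$$

$$B_2 = -n(W\psi)\left[\frac{K}{R^3} + \left(\frac{m+1}{m-1}\right)\frac{1}{R^2}\left(\frac{1}{W}-\frac{1}{R}\right) - \frac{b}{2n}\right]$$

$$B_1 = n(W\psi)^2\left[\frac{K}{R^3} + \frac{1}{R}\left(\frac{1}{W}-\frac{1}{R}\right)^2 - \frac{b}{2n}\right]$$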
K 1/1 1 1 1 1 1 b 
3 _ 3 
a Entrance pupil at distance W from surface.

The choice of $B_1 = 0$ or $B_1' = 0$ locates the tangential or sagittal images,
respectively. Corresponding to Eqs. (5.1.8) and (5.1.9) we find

$$\frac{n'\cos^2\theta'}{s_t'} - \frac{n\cos^2\theta}{s} = \frac{n'\cos\theta' - n\cos\theta}{R} - 6LQ, \tag{5.5.4}$$

$$\frac{n'}{s_s'} - \frac{n}{s} = \frac{n'\cos\theta' - n\cos\theta}{R} - 2LQ. \tag{5.5.5}$$

The relation analogous to Eq. (5.1.10), through terms in $\theta^2$, is

$$\frac{\Delta s'}{s'^2} = \left(\frac{\Delta s'}{s'^2}\right)_{W=0} - \frac{4LQ}{n'}, \tag{5.5.6}$$

where the first term to the right of the equal sign is given in Eq. (5.1.10).

The next step is to choose an image at which to evaluate the astigmatism and,
as in Section 5.2, the choice is the sagittal image. Solving Eq. (5.5.5) for $s'$,
substituting the result into $B_1$ in Eq. (5.5.3), and evaluating to second-order in $\theta$,
gives

$$B_1 = A_1 + 2LQ = -\frac{n'}{2}\left(\frac{\Delta s'}{s'^2}\right)_{W=0} + 2LQ. \tag{5.5.7}$$

Note that $A_1$ in the first part of Eq. (5.5.7) is the value at the sagittal image, hence
different from $A_1$ in Eq. (5.5.3). From a comparison of Eqs. (5.5.6) and (5.5.7) we
find

$$\Delta s'/s'^2 = -2B_1/n', \tag{5.5.8}$$

a relation corresponding to Eq. (5.2.5). Thus $B_1$ is a measure of the astigmatism
when the entrance pupil is not at the surface. The entry for $B_1$ in Table 5.5 is
obtained by evaluation of Eq. (5.5.7).

The relations between the aberration coefficients and the transverse aberrations
are similar to those given in Eqs. (5.4.1), but with $B_j$ replacing $A_j$:

$$\mathrm{TSA3} = \frac{4B_3 y^3 s'}{n'}, \quad \mathrm{TSC} = \frac{B_2 y^2 s'}{n'} = \frac{\mathrm{TTC}}{3}, \quad \mathrm{TAS} = \frac{2B_1 y s'}{n'}, \quad \mathrm{TDI} = \frac{B_0 s'}{n'}. \tag{5.5.9}$$

The full aberration is obtained from Eq. (5.5.9) when y is replaced by the height
at which a marginal ray from Q on the z-axis intersects the surface. Tables
analogous to 5.3 and 5.4 are left to the reader.
5.5.c. EXAMPLES


At this point it is appropriate to illustrate the aberration relations with two 
examples. Our choices are a sphere and a paraboloid, each illuminated with 
collimated light, hence m = 0. We assume no aspheric component and set b = 0. 

For the sphere we find $A_3 = B_3 = n/4R^3$, hence we expect that both $B_2$ and $B_1$
will depend on the stop position. Inspection of the coefficients in Table 5.6 shows
that this is the case. When W = R, both $B_1$ and $B_2$ are zero; this stop position is
the starting point for the Schmidt camera.

Taking a sphere in collimated light with an aspheric component, we find
$B_3 = 0$ when $b = 2n/R^3$. This aspheric component could be put directly on the
mirror, but this does little good because now the mirror has nonzero coma
independent of W. The solution, of course, is to put the aspheric component on
another optical element located at W = R, as already shown in our discussion of
the Schmidt camera. The discussion of how aberrations are calculated for systems
with many surfaces is the topic of the next section.

Consider now the paraboloid with $B_3 = 0$. Setting m = 0 and K = -1 in $B_2$ we
see that the coma is independent of W. This is the expected result, based on the
stop-shift relations following Eq. (5.5.1). Because coma is not zero the astigmatism
coefficient depends on W, hence a proper choice of W will mean zero
astigmatism. From $B_1$ in Table 5.6 we see that this choice is W = R/2, hence the
stop is at the focal surface.
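
These examples can be reproduced numerically from the stop-shift relations alone. The sketch below is not the tabulated form of Table 5.6; it assumes the stop-at-surface mirror coefficients A_1 = n theta^2/R, A_2 = -(n theta/R^2)(m+1)/(m-1), A_3 = (n/4R^3)[K + ((m+1)/(m-1))^2] - b/8, the paraxial mirror relation 1/s = 2m/[R(m-1)], and Eqs. (5.5.1), (5.5.2), and (5.5.7). The function name is hypothetical.

```python
def mirror_displaced_stop(n, psi, R, W, m, K=0.0, b=0.0):
    """Return (B1, B2, B3) for a mirror with entrance pupil at distance W.

    psi is the chief ray angle and m the magnification (m = 1 excluded).
    """
    q = (m + 1.0) / (m - 1.0)
    inv_s = 2.0 * m / (R * (m - 1.0))   # assumed paraxial mirror relation
    theta = psi * (1.0 - W * inv_s)     # Eq. (5.5.2)
    L = -W * psi                        # Eq. (5.5.2)
    a1 = n * theta**2 / R
    a2 = -n * theta * q / R**2
    a3 = n * (K + q**2) / (4.0 * R**3) - b / 8.0
    Q = a2 + 2.0 * L * a3
    b1 = a1 + 2.0 * L * Q               # astigmatism at sagittal image, Eq. (5.5.7)
    b2 = a2 + 4.0 * L * a3              # coma, from Eq. (5.5.1)
    return b1, b2, a3
```

For a sphere in collimated light (K = 0, m = 0) both B_1 and B_2 vanish at W = R; for a paraboloid (K = -1, m = 0) B_2 is the same for every W and B_1 vanishes at W = R/2.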


5.6. ABERRATIONS FOR MULTISURFACE SYSTEMS 


5.6.a. GENERAL FORMULATION 


The real power of the approach to aberrations using Fermat’s Principle is 
particularly evident when systems with many surfaces are analyzed. For any 
surface in such a system, say the ith one, the object and image are at $Q_i$ and $Q_i'$,
located at distances $s_i$ and $s_i'$, respectively, from the surface. Between the object
and image the OPD between an arbitrary ray and the chief ray is given by Eq.
(5.5.1), where $W_i$ is the position of the entrance pupil for the surface. If this same
ray is followed from the original object to the final image, then the OPD for the
system is

$$\Phi_s = \Phi_1 + \Phi_2 + \cdots + \Phi_f = \sum_i \Phi_i, \tag{5.6.1}$$

where the subscript f denotes the last surface. Each term in Eq. (5.6.1) can be
replaced by Eq. (5.5.1), with the appropriate (x, y) at each surface. Before making
this substitution, note that a complete description of the aberrations according to
Eq. (5.4.1) is obtained from rays in the yz-plane only. Therefore we set x = 0 in
Eq. (5.5.1), and the system OPD for rays in the yz-plane is

$$\Phi_s = \sum_i \left(B_{0i}\,y_i + B_{1i}\,y_i^2 + B_{2i}\,y_i^3 + B_{3i}\,y_i^4\right) = \sum_i \Big(\sum_j B_{ji}\,y_i^{\,j+1}\Big), \quad j = 0, 1, 2, 3, \tag{5.6.2}$$

with the geometrical distance along an arbitrary ray between the actual and
reference wavefronts at the final surface given by

$$\Delta = \Phi_s(y_i)/n_f'. \tag{5.6.3}$$


To find the transverse aberration at the final image, we proceed along the lines
followed in going from Eq. (5.3.1) to Eq. (5.3.3), but do so with only one of the
aberration terms, say the jth one. With reference to the last surface we get

$$\frac{\partial\Delta}{\partial y_f} = \frac{1}{n_f'}\frac{\partial\Phi_s}{\partial y_f} = \frac{1}{n_f'}\sum_i (j+1)\,B_{ji}\,y_i^{\,j}\,\frac{\partial y_i}{\partial y_f}. \tag{5.6.4}$$

The partial derivative in Eq. (5.6.4) is easily evaluated with the aid of Fig. 5.13,
where two rays from an intermediate axial object point are shown passing through
several surfaces. Because each $Q_i$ is imaged to $Q_i'$, the ratio of the differential
change in $y_i$ to that of any other y, say the fth one, is simply the ratio $y_i/y_f$.
Substituting this into Eq. (5.6.4) and multiplying by $s_f'$, we can write the jth
transverse aberration in the y-direction as

$$TA_{js} = s_f'\,\frac{\partial\Delta}{\partial y_f} = \frac{s_f'}{n_f'}\,(j+1)\left[\sum_i B_{ji}\left(\frac{y_i}{y_f}\right)^{j+1}\right] y_f^{\,j} \tag{5.6.5}$$

or

$$TA_{js} = \frac{s_f'}{n_f'}\,(j+1)\,B_{js}\,y_f^{\,j}. \tag{5.6.6}$$

In Eq. (5.6.6) the representation of the sum in brackets in Eq. (5.6.5) is reduced to
a single symbol, with the subscript s denoting a system aberration coefficient.
Note the close correspondence between Eq. (5.6.6) and each of the terms
containing y in Eq. (5.3.4).

Calculation of the transverse aberration using Eq. (5.6.6) is based on the
marginal ray height at the last surface. In many cases it is convenient to express
the transverse aberration in terms of the marginal ray height at some other
surface, such as at the system entrance pupil. This is easily done by multiplying
and dividing Eq. (5.6.5) by the ray height at another surface, say the lth one,
raised to the j + 1 power. The results are

$$B_{js} = \sum_i B_{ji}\left(\frac{y_i}{y_l}\right)^{j+1}, \tag{5.6.7}$$

$$TA_{js} = \frac{s_f'}{n_f'}\left(\frac{y_l}{y_f}\right)(j+1)\,B_{js}\,y_l^{\,j}. \tag{5.6.8}$$

Note that the terms in Eq. (5.6.7) depend on the choice of l but $TA_{js}$ is, of course,
independent of this choice.

Fig. 5.13. Paths of two adjacent rays from $Q_i$ through several surfaces, where $Q_i$ is an
intermediate axial object point. Differential heights $dy_i \propto y_i$ at each surface.

The formalism needed to calculate third-order aberrations for a multisurface 
system is now complete. The necessary aberration coefficients are in Tables 5.5 
and 5.6, and it is simply a matter of computing each one surface-by-surface and 
substituting into Eqs. (5.6.7) and (5.6.8). 

The relations given in Section 5.4 between the signs of the aberration 
coefficients and the character of an image also hold for the coefficients in Eq. 
(5.6.7) and the transverse aberrations in Eq. (5.6.8). 
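
The surface-by-surface bookkeeping of Eqs. (5.6.7) and (5.6.8) can be sketched in a few lines of Python. The function and argument names are assumptions made for illustration; `b_ji` holds the jth coefficient for each surface in order, `heights` the marginal ray heights y_i, and `ref` the index of the reference surface.

```python
def system_coefficient(b_ji, heights, j, ref):
    """System coefficient B_js of Eq. (5.6.7), referenced to surface `ref`."""
    y_ref = heights[ref]
    return sum(b * (y / y_ref) ** (j + 1) for b, y in zip(b_ji, heights))


def transverse_aberration(b_js, j, y_ref, y_last, s_last, n_last):
    """Transverse aberration TA_js of Eq. (5.6.8).

    s_last and n_last are the image distance and index after the last surface.
    """
    return (s_last / n_last) * (y_ref / y_last) * (j + 1) * b_js * y_ref**j
```

Evaluating the same system with two different reference surfaces changes B_js but not TA_js, which is a convenient check on an implementation.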


5.6.b. EXAMPLE: ABERRATION COEFFICIENTS OF TWO-MIRROR 
TELESCOPES 


As an example of the procedure in Section 5.6.a, consider either the 
Cassegrain or Gregorian telescope shown in Fig. 2.7. We assume the stop is at 
the primary, thus its aberration coefficients can be taken from Table 5.2. Setting
n = 1, m = 0, and b = 0 gives

$$B_{01} = 0, \quad B_{11} = \frac{\theta^2}{R_1}, \quad B_{21} = \frac{\theta}{R_1^2}, \quad B_{31} = \frac{1}{4R_1^3}(K_1 + 1). \tag{5.6.9}$$
For the secondary n = -1, b = 0, and $\psi = -\theta$ (from the law of reflection
$i' = -i$). From Table 5.6 we get


K, 1/1 1 oe SS OS | 
Ba = -(Woy| 34+ —( —-— }(—-++=-—}]. 
i le ale rales RW a) 


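$$\begin{aligned}
B_{12} &= -(W\theta)^2\left[\frac{K_2}{R_2^3} + \frac{1}{R_2}\left(\frac{1}{W}-\frac{1}{R_2}\right)^2\right], \\
B_{22} &= -W\theta\left[\frac{K_2}{R_2^3} + \left(\frac{m+1}{m-1}\right)\frac{1}{R_2^2}\left(\frac{1}{W}-\frac{1}{R_2}\right)\right], \\
B_{32} &= -\frac{1}{4R_2^3}\left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right].
\end{aligned} \tag{5.6.10}$$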

With $y_l = y_1$, the marginal ray height at the primary, we get

$$B_{js} = B_{j1} + B_{j2}\,(y_2/y_1)^{j+1}; \quad j = 0, 1, 2, 3, \tag{5.6.11}$$

where the subscripts 1, 2, and s on the $B_j$ refer to the primary, secondary, and
telescope, respectively.

In terms of the normalized parameters defined in Chapter 2 for a two-mirror
telescope we have $k = y_2/y_1$, $\rho = R_2/R_1$, and $W = (1 - k)f_1 = -(1 - k)R_1/2$.
For spherical aberration we then get

$$B_{3s} = \frac{1}{4R_1^3}\left\{K_1 + 1 - \frac{k^4}{\rho^3}\left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right]\right\}. \tag{5.6.12}$$

Note that spherical aberration is zero when the expression in braces is zero, a 
result previously given in Eq. (4.5.3). It was derived there by starting with a 
classical Cassegrain and “bending” the mirrors subject to the requirement that 
Fermat’s Principle be satisfied. 

Given Eq. (5.6.12) it is now a simple matter to find TSA3. Using Eqs. (2.5.3)
and (2.5.7) we find $s_f' = mkf_1 = kf$. Putting this and $n_f' = 1$ into Eq. (5.6.8) gives

$$\mathrm{TSA3} = f\left(\frac{y_1}{R_1}\right)^3\{\cdots\} = -\frac{f}{64F_1^3}\{\cdots\}, \tag{5.6.13}$$

where the quantity in braces is that in Eq. (5.6.12), and $F_1$ is the focal ratio of the
primary mirror. Note that the sign of $R_1$ makes the factor outside of the braces in
Eqs. (5.6.12) and (5.6.13) negative.

Expressions for the other aberrations using Eqs. (5.6.9) through (5.6.11) are 
determined using the same procedure, but this development is left for Chapter 6 
where the characteristics of telescopes are explored in detail. 
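
As a hedged numerical illustration of Eqs. (5.6.12) and (5.6.13), the sketch below evaluates the quantity in braces and the resulting TSA3. It assumes the brace has the form K_1 + 1 - (k^4/rho^3)[K_2 + ((m+1)/(m-1))^2] and that TSA3 = -(f/64 F_1^3) times the brace; the function names and sample values are illustrative only.

```python
def sa_brace(K1, K2, m, k, rho):
    """Quantity in braces in Eq. (5.6.12); zero means no spherical aberration."""
    q = (m + 1.0) / (m - 1.0)
    return K1 + 1.0 - (k**4 / rho**3) * (K2 + q * q)


def classical_secondary_conic(m):
    """K2 that zeroes the brace when the primary is a paraboloid (K1 = -1)."""
    return -(((m + 1.0) / (m - 1.0)) ** 2)


def tsa3(f, F1, brace):
    """Transverse spherical aberration from Eq. (5.6.13)."""
    return -f / (64.0 * F1**3) * brace
```

With K_1 = -1 and K_2 from `classical_secondary_conic`, the brace vanishes for any k and rho, which is the classical two-mirror case; a spherical secondary (K_2 = 0) behind the same paraboloid leaves a nonzero TSA3.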
5.7. CURVATURE OF FIELD


The remaining third-order aberration to be considered is that of curvature of 
field. As noted in Chapter 4 this aberration does not affect the image quality, but 
given the usual case of a flat detector can adversely affect the image definition 
over an extended field. For the Schmidt camera, for example, the focal surface has 
a radius of curvature equal to the camera focal length. Matching the detector to 
the focal surface requires either deforming it to the proper radius or using another 
optical element to “flatten” the field. The former method is used with most large 
Schmidt telescopes by bending photographic plates. The use of a field-flattener 
lens is discussed later in this section. 

We now consider the situation shown in Fig. 5.14 where an optical surface
whose vertex is at the origin of the (x, y, z) coordinate system images the curved
object surface $\Sigma$ into a curved image surface $\Sigma'$. The surfaces $\Sigma$ and $\Sigma'$ have radii
of curvature $r$ and $r'$, respectively, with the sign convention for each the same as
for a surface radius of curvature, thus $r > 0$ and $r' < 0$ in Fig. 5.14. As a final
definition, let $\kappa$ denote the curvature of the image surface, with $\kappa = 1/r'$. It
should be noted here that our sign convention for $r$ and $r'$ is opposite that of Born
and Wolf (1980), but we choose to preserve its universal character.

The diagram in Fig. 5.14 can apply to any individual optical surface within a
multi-surface system, where $\Sigma$ is an intermediate object surface and $\Sigma'$ its
conjugate surface. In the discussion to follow we will not designate these surfaces
with specific indices, but will include them at the end as needed.

Fig. 5.14. Curved object surface $\Sigma$ imaged to surface $\Sigma'$. The radii of curvature of the object and
image surfaces are $r$ and $r'$, respectively. The adopted sign convention has $r > 0$ and $r' < 0$ in the
diagram.

The procedure is one of finding $s'$ in terms of $\theta'$ and applying a general relation
between $\kappa$ and $s'$ to get the curvature of $\Sigma'$. A glance at Eqs. (5.5.4) and (5.5.5)
shows that $s'$ can contain only even powers of $\theta$ and $\theta'$ when expanded in a power
series. Thus we can solve these equations for $s'$ and, after substituting $n\theta = n'\theta'$,
get $s'$ through second order in the form

$$1/s' = a_0 + a_1\theta'^2. \tag{5.7.1}$$

With this form of $s'$, it can be shown that the curvature $\kappa$ to zeroth order is given
by

$$\kappa = -(a_0 + 2a_1), \tag{5.7.2}$$

hence $\kappa$ is constant and to this approximation the image surface is a section of a
sphere. (The relation between $\kappa$ and $s'$ in polar coordinates from which Eq. (5.7.2)
is derived can be found in the Mathematics Manual by Merritt, 1962.)
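
Equation (5.7.2) can be checked numerically without the general polar-coordinate relation. The sketch below assumes the image point lies at distance s' from the origin along a direction making angle theta' with the z-axis, so that z = s' cos(theta') and y = s' sin(theta'), and it estimates the curvature from the sag relation u = y^2 kappa/2. The function name and sample coefficients are illustrative.

```python
import math


def curvature_from_polar(a0, a1, theta=1.0e-3):
    """Estimate the image surface curvature from 1/s' = a0 + a1*theta'^2."""
    s = 1.0 / (a0 + a1 * theta**2)
    z = s * math.cos(theta)
    y = s * math.sin(theta)
    sag = z - 1.0 / a0          # displacement from the axial image point
    return 2.0 * sag / y**2
```

For small angles the estimate converges to -(a0 + 2*a1), in agreement with Eq. (5.7.2).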


5.7.a. PETZVAL SURFACE 


We first determine the curvature of a special surface called the Petzval surface.
This surface is the image surface in the special case where the astigmatism is
zero, hence $s_s' = s_t' = s_p'$, where $s_p'$ is the distance from the origin in Fig. 5.14 to
the Petzval surface. We can find $s_p'$ from either Eq. (5.5.4) or (5.5.5).

Given the condition that $\Delta s' = 0$ in Eq. (5.5.8) we find $2LQ = -A_1$ from Eq.
(5.5.7). Substituting this into Eq. (5.5.5) gives

$$\frac{n'}{s_p'} - \frac{n}{s} = \frac{n'\cos\theta' - n\cos\theta}{R} + A_1. \tag{5.7.3}$$

To put Eq. (5.7.3) into the form required by Eq. (5.7.1) means the usual power
series substitutions and using $n\theta = n'\theta'$ to eliminate $\theta$. In addition to the angles
that appear explicitly in Eq. (5.7.3), the distance $s$ depends on $\theta$. The relation
between $s$ and $\theta$ is found using the geometry in Fig. 5.15, where the sag $u$ of $\Sigma$ is

$$u = \frac{y^2}{2r} = \frac{s^2\sin^2\theta}{2r} = s\cos\theta - z_0,$$

and, solving for $s$, leads to

$$\frac{1}{s} = \frac{1}{z_0}\left[1 - \frac{\theta^2}{2}\left(1 + \frac{z_0}{r}\right)\right].$$

Substituting this result for $s$ in Eq. (5.7.3) and collecting terms gives

$$\frac{1}{s_p'} = \frac{n}{n'z_0} + \frac{n'-n}{n'R} + \frac{\theta'^2}{2}\left[\frac{(n'-n)^2}{n'nR} - \frac{n}{n'z_0} - \frac{n'}{nr}\right]. \tag{5.7.4}$$

Fig. 5.15. Sag $u$ of the object surface, with $u = y^2/2r$. See the discussion following Eq. (5.7.3).


Equation (5.7.4) is in the form shown in Eq. (5.7.1) and we now find $\kappa = 1/r'$
from Eq. (5.7.2). The result is

$$\frac{1}{n'r'} - \frac{1}{nr} = -\frac{n'-n}{n'nR}. \tag{5.7.5}$$


The importance of Eq. (5.7.5) lies in the fact that the curvature of the Petzval
surface does not depend on the distances $s$ and $s'$, nor on the position of the
entrance pupil for this surface. This relation applies to each surface in a system
and, given that $n'r'$ for the ith surface is $nr$ for the (i + 1)st surface, leads to a sum
over all surfaces given by

$$\frac{1}{n_f'r_f'} - \frac{1}{n_1 r_1} = -\sum\left(\frac{n'-n}{n'nR}\right), \tag{5.7.6}$$

where 1 and f refer to the first and last surfaces, respectively. For a flat object field,
the most common situation, we get

$$\kappa_p = -n_f'\sum\left(\frac{n'-n}{n'nR}\right). \tag{5.7.7}$$


Thus for any optical system for which the object field is flat, the Petzval surface is 
an invariant surface. If the system has astigmatism, each of the astigmatic image 
surfaces will have its own curvature. But, as we now show, there are definite 
relations between these curvatures, the amount of astigmatism, and the Petzval 
curvature. 
5.7.b. CURVATURES OF ASTIGMATIC SURFACES


The procedure to find these curvature relations starts with the substitution of
Eq. (5.2.5) into Eq. (5.5.6), giving

$$\frac{\Delta s'}{s'^2} = \frac{s_s' - s_t'}{s_s' s_t'} = -\frac{2}{n'}\,(A_1 + 2LQ).$$

Solving for $2LQ$ and substituting into Eq. (5.5.5) gives

$$\frac{n'}{s_s'} = \frac{n}{s} + \frac{n'\cos\theta' - n\cos\theta}{R} + A_1 + \frac{n'}{2}\left(\frac{s_s' - s_t'}{s_s' s_t'}\right),$$

where we see by comparison with Eq. (5.7.3) that the first three terms to the right
of the equals sign are simply $n'/s_p'$. With this substitution we get

$$2(s_p' - s_s') = s_s' - s_t', \tag{5.7.8}$$

where the factors in the denominator cancel because of their near equality.
Equation (5.7.8) can also be written as

$$s_p' - s_t' = 3(s_p' - s_s'). \tag{5.7.9}$$


The geometric interpretation of Eq. (5.7.9) is a simple one; at a given height y the 
distance between the Petzval and tangential surfaces is three times the distance 
between the Petzval and sagittal surfaces, with the sagittal surface always between 
the other two. Because astigmatism is zero on-axis, the image surfaces are in 
contact where they intersect the z-axis. 

Note that Eq. (5.7.9) holds for any surface in an optical system and does not 
depend on the object distance for that surface, nor does it depend on the entrance 
pupil location. Hence it must also hold for the final image surfaces of a system 
with many surfaces, and the relations to follow are taken at the final surfaces. 

It is a simple matter to write Eqs. (5.7.8) and (5.7.9) in terms of surface
curvatures. From the geometry in Fig. 5.16 we get

$$u_\alpha - u_\beta = s_\alpha' - s_\beta', \tag{5.7.10}$$

where $u$ is the surface sag and $\alpha$ and $\beta$ denote any pair of image surfaces. We also
see that $u = y^2\kappa/2$ and thus

$$\kappa_\alpha - \kappa_\beta = \frac{2}{\theta'^2}\left(\frac{s_\alpha' - s_\beta'}{s'^2}\right), \tag{5.7.11}$$

from which it follows using Eqs. (5.7.8) and (5.7.9) that

$$\kappa_s - \kappa_t = 2(\kappa_p - \kappa_s), \qquad \kappa_p - \kappa_t = 3(\kappa_p - \kappa_s). \tag{5.7.12}$$

Fig. 5.16. Sags of image surfaces $\alpha$ and $\beta$, where $u_\alpha - u_\beta = s_\alpha' - s_\beta'$ in the paraxial
approximation.

Choosing $\alpha = s$ and $\beta = t$ in Eq. (5.7.11) and substituting Eq. (5.5.8) we get

$$\kappa_s - \kappa_t = -4B_{1s}/n'\theta'^2, \tag{5.7.13}$$

where $B_{1s}$ is the system astigmatism coefficient from Eq. (5.6.7) with $l = f$. Note
that $B_{1s}$ is referenced to the last surface in the system because $\theta'$ is specified at the
last surface.

It is now a simple matter to combine Eqs. (5.7.12) and (5.7.13) and solve for
the curvatures of the individual surfaces. The results are given in Table 5.7.
Included is an entry for the curvature $\kappa_m$ of the surface midway between the
S- and T-surfaces, that surface on which the astigmatic images are circular.


5.7.c. EXAMPLES 


A few simple examples are now in order. Consider first a spherical mirror in
collimated light, hence m = 0. From Table 5.6 we get

$$B_1 = \frac{n\psi^2}{R}\left(1 - \frac{W}{R}\right)^2,$$

where $\psi = \theta$ from Eq. (5.5.2). For a single reflecting surface we have
$n' = -n$, $\theta' = -\theta$. Substituting into the entries in Table 5.7 leads to

$$\kappa_p = \frac{2}{R}, \quad \kappa_s = \frac{2}{R} - \frac{2}{R}\left(1 - \frac{W}{R}\right)^2, \quad \kappa_t = \frac{2}{R} - \frac{6}{R}\left(1 - \frac{W}{R}\right)^2. \tag{5.7.14}$$

For W = R each of the curvatures in Eq. (5.7.14) is 2/R, as expected, because the
astigmatism is zero. For W = 0 we find $\kappa_s = 0$, $\kappa_t = -4/R$. Thus the Petzval and
tangential surfaces have opposite curvatures with a flat sagittal surface between
them. Although collimated light was specified in the foregoing, note that
these results hold for any object distance because $B_1$ is independent of the
magnification m.

Table 5.7


Image Surface Curvatures

$$\kappa_p = -n_f'\sum\left(\frac{n'-n}{n'nR}\right)$$

$$\kappa_s = \kappa_p + \frac{2B_{1s}}{n'\theta'^2}, \qquad \kappa_t = \kappa_p + \frac{6B_{1s}}{n'\theta'^2}$$

$$\kappa_m = \frac{\kappa_s + \kappa_t}{2} = \kappa_p + \frac{4B_{1s}}{n'\theta'^2}, \qquad B_{1s} = \sum_i B_{1i}\left(\frac{y_i}{y_f}\right)^2$$
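
The spherical-mirror example can be tabulated for any stop position with a short Python sketch. It assumes the forms kappa_p = 2/R, kappa_s = 2/R - (2/R)(1 - W/R)^2, and kappa_t = 2/R - (6/R)(1 - W/R)^2 of Eq. (5.7.14), with the mean surface taken midway between the sagittal and tangential surfaces; the function name is illustrative.

```python
def sphere_image_curvatures(R, W):
    """Return (kappa_p, kappa_s, kappa_t, kappa_m) for a spherical mirror.

    R is the mirror radius of curvature, W the entrance pupil distance.
    """
    t = (1.0 - W / R) ** 2
    kp = 2.0 / R
    ks = kp - 2.0 * t / R
    kt = kp - 6.0 * t / R
    km = 0.5 * (ks + kt)
    return kp, ks, kt, km
```

At W = R all four curvatures equal 2/R, and at W = 0 the sagittal surface is flat while the tangential surface has curvature -4/R.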

As a second example consider a Schmidt camera. As already noted here and in 
Section 4.5, a spherical mirror with W = R has zero astigmatism but a curved 
image surface with curvature 2/R. One way to flatten the image surface is to 
introduce another element whose astigmatism is zero, to a first approximation, 
and to choose its characteristics to make the Petzval curvature zero for the system. 
This is done with a thin lens located near the image surface, as shown in Fig. 
5.17. The contribution of the corrector plate is ignored in the analysis to follow 
because R, >> R for any practical focal ratio, a result that is evident from Eq. 
(4.5.13). 

Fig. 5.17. Schmidt camera with lens L near focal surface to give zero Petzval curvature. See Eq.
(5.7.15).

The Petzval curvature for the mirror-lens combination, derived from the
relation in Table 5.7, is

$$\kappa_p = \frac{2}{R} - \left(\frac{n-1}{n}\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right), \tag{5.7.15}$$

where $R_1$ and $R_2$ are the radii of curvature of the first and second surfaces,
respectively, of the lens and the index n of the lens is positive. Setting $R_2 = \infty$
gives a lens whose flat surface faces the image surface and, though this is not
required, we make this choice for convenience. Therefore the Petzval surface is
flat if

$$\frac{n-1}{nR_1} = \frac{2}{R}. \tag{5.7.16}$$

Because R of the mirror is negative, so also is $R_1$, hence the lens is plano-convex
in cross section (thickest at the center) and has positive power. Though the 
combination now has a flat Petzval field, the thin lens may introduce some 
astigmatism. We show in Chapter 9 that the amount is small if the lens is close to 
the image surface. 
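
A minimal numerical sketch of the field-flattener condition follows. It assumes Eq. (5.7.15) in the form kappa_p = 2/R - ((n-1)/n)(1/R_1 - 1/R_2) and Eq. (5.7.16) in the form (n-1)/(n R_1) = 2/R; the function names and the sample index and radius are illustrative, not values from the text.

```python
import math


def petzval_mirror_lens(R, n, R1, R2=math.inf):
    """Petzval curvature of a mirror of radius R plus a thin lens of index n."""
    return 2.0 / R - ((n - 1.0) / n) * (1.0 / R1 - 1.0 / R2)


def flattener_radius(R, n):
    """First-surface radius R1 of a plano-convex flattener (R2 infinite)."""
    return (n - 1.0) * R / (2.0 * n)
```

For a mirror with R = -1000 mm and a lens of index 1.5, the sketch gives R_1 of about -166.7 mm, negative like R as stated above, and the combined Petzval curvature is then zero.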

Our final example is that of a two-mirror telescope, either Cassegrain or
Gregorian. For light incident on the primary according to our usual convention we
have $n_1 = 1$, $n_2 = -1$. The Petzval curvature is then

$$\kappa_p = 2\left(\frac{1}{R_2} - \frac{1}{R_1}\right) = \frac{2}{R_1}\left(\frac{1-\rho}{\rho}\right), \tag{5.7.17}$$

where $\rho = R_2/R_1$. For a Gregorian $\rho < 0$ and $\kappa_p$ is opposite in sign to that of $R_1$.
Thus the Petzval surface for a Gregorian is convex as seen from the secondary.
For a Cassegrain the Petzval surface is concave as seen from the secondary,
provided $\rho < 1$. Discussion of the curvatures of the other astigmatic surfaces of
two-mirror telescopes is left to Chapter 6.
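
The sign behavior described above is easy to confirm numerically. The sketch below assumes Eq. (5.7.17) in the form kappa_p = 2(1/R_2 - 1/R_1); the function name and the sample radii are illustrative.

```python
def two_mirror_petzval(R1, R2):
    """Petzval curvature of a two-mirror telescope with a flat object field."""
    return 2.0 * (1.0 / R2 - 1.0 / R1)
```

With R_1 = -10 and R_2 = -3 (a Cassegrain-like case with rho = 0.3) the curvature has the same sign as R_1, while R_2 = +3 (a Gregorian-like case with rho = -0.3) gives a curvature of opposite sign to R_1.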


5.8. ABERRATIONS FOR DECENTERED PUPIL 


The aberration results given to this point are correct for an optical system in 
which all of the elements, including the aperture stop and pupils, are centered. By 
centered we mean there is a single axis, designated the z-axis, about which the 
system can be rotated without change, where the z-axis passes through the center 
of each element. If one or more of these elements is displaced laterally from the z- 
axis or rotated about a line perpendicular to the z-axis, the system is no longer 
rotationally symmetric and aberrations are introduced. The lateral displacement is 
commonly referred to as a decenter and the rotation as a tilt. Decenter and/or tilt 
of, for example, the secondary mirror in a two-mirror telescope is one important 
case of this loss of symmetry, and is discussed in Chapter 6. 
5.8.a. GENERAL FORMULATION


In this section we find the aberration coefficients for a general surface with its 
associated pupil where the center of the pupil is displaced from the z-axis of the 
surface. A cross section of this situation is shown in Fig. 5.18 where the pupil is 
displaced in the y-direction by $L'$. The chief ray passes through the center of the
pupil and makes angle $\psi$ with the z-axis, the axis of symmetry of the surface. In
the paraxial approximation we find the following relations from the geometry in
Fig. 5.18:

$$L = L' - W\psi, \qquad \theta = \psi\left(1 - \frac{W}{s}\right) + \frac{L'}{s}, \tag{5.8.1}$$


where W is the distance from the surface to the entrance pupil, and the signs of 
each distance and angle are set by the sign convention (see caption to Fig. 5.18). 
The relations in Eq. (5.8.1) are a generalization of those in Eq. (5.5.2). 

The procedure of finding the aberration coefficients is the same as that 
followed in Section 5.5, except that Eq. (5.8.1) is used instead of Eq. (5.5.2) 
when substituting into Eq. (5.5.1). The results of carrying out these substitutions 
for coma and astigmatism are given in Table 5.8 for a general surface and Table 
5.9 for a mirror surface. Distortion is not included because the change in it is too 
small to be significant. The spherical aberration is independent of $L'$, hence
$B_3 = B_3$(cen).

Examination of the entry for $B_2$ in Table 5.8 shows that the part of the coma
coefficient that results from the decentering is not dependent on the angle of the
chief ray. Hence the effect of the decentering is to introduce constant coma over
the entire image field, in addition to any angle-dependent coma that is present.
The effect of the decentering on $B_1$ is to introduce a constant term and a term that
depends linearly on the angle of the chief ray. Hence astigmatism is also present
over the entire image field.

Fig. 5.18. Sketch of chief ray through center of stop displaced by $L'$ from the z-axis of the surface.
The relation between parameters is given in Eq. (5.8.1). In this diagram $L$ and $L' > 0$, $\psi$ and $\theta > 0$.

Table 5.8

General Aberration Coefficients, Decentered Pupil^a

LTr/. 1 K,
B = B,(cen) + | B(+- 2) — pe" =n) -6]
L°?7T K,
B, = B,(cen) 7 letee n+]
-roof (eR) —9-9]
1 l
T=n?{ —-——
£ (= =)

^a $B_i$(cen) are entries in Table 5.5, with i = 1, 2.

Table 5.9

Mirror Aberration Coefficients, Decentered Pupil^a

nL’ m+1\ bR
B, = B,(cen) + B [k ( ) |
m-l 2n
K
B, = Bicer) + nt? ( sa >)
RB 2n
QnLiWw) {1 K+1 bR
al K+, )
R wW R 2n

^a $B_i$(cen) are entries in Table 5.6, with i = 1, 2.

The calculation of the system aberration coefficients is carried out following 
the procedure in Section 5.6. In all cases of interest, it turns out that the effect of a 
decentered stop is much greater on coma than on astigmatism, hence the image 
surface curvatures are not significantly affected and the results of Table 5.7 can be 
used with $B_{1s}$(cen).


5.8.b. EXAMPLE: SCHMIDT CAMERA 


At this point it is instructive to give an example of a system with decentered 
stop. The example discussed is that of a Schmidt camera in which the axis of the 
corrector plate is displaced from the mirror axis, as shown in Fig. 5.19. The 
aperture stop of the system is the corrector plate, with collimated light incident. 
Fig. 5.19. Schmidt camera with axis of corrector $z_c$ displaced from mirror axis $z_m$ by $L'$.

Because $W = 0$ for the corrector, its aberration coefficients from Table 5.5 are

$$B_1 = 0, \qquad B_2 = 0, \qquad B_3 = -b/8, \tag{5.8.2}$$


where $R = \infty$ is the choice for the radius of curvature of the corrector. (For a
corrector profile that minimizes chromatic aberration, R is finite as shown in 
Section 4.5. However, as we show in Chapter 7, the aberration coefficients are 
dominated by the term in b.) 
The parameters for the spherical mirror are $W = R$, $m = 0$, $b = 0$, and $n = 1$,
in which case $B_1$(cen) and $B_2$(cen) are zero. From Table 5.9 we then find

$$B_1 = \frac{L'^2}{R^3}, \qquad B_2 = \frac{L'}{R^3}, \qquad B_3 = \frac{1}{4R^3}. \tag{5.8.3}$$

With the ray heights at the corrector and mirror equal, the system aberration
coefficients according to Eq. (5.6.7) are simply the sums of corresponding terms
in Eqs. (5.8.2) and (5.8.3). Putting these sums into Eq. (5.6.6) and dividing by $s'$
to get the angular aberration gives

$$\mathrm{AAS} = \frac{1}{2F}\left(\frac{L'}{R}\right)^2, \qquad \mathrm{ATC} = \frac{3}{16F^2}\left(\frac{L'}{R}\right), \tag{5.8.4}$$

for the angular astigmatism and tangential coma, respectively. Spherical
aberration is zero provided $b = 2/R^3$.

The relation for ATC in Eq. (5.8.4) can be used to find the largest permissible 
L’ for a given ATC. If we choose a blur limit of 1 arc-sec, then the reader can 
verify that $L'/R$, expressed in arc-seconds, cannot exceed $16F^2/3$. The value of
AAS for this value of L’/R is about 1000 times smaller and thus is negligible. 
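The decenter tolerance quoted above is easy to evaluate numerically. The sketch below inverts ATC $= (3/16F^2)(L'/R)$ for a chosen coma limit and then estimates the accompanying astigmatism. The astigmatism expression AAS $= (1/2F)(L'/R)^2$ is our reading of Eq. (5.8.4) and should be treated as an assumption, as should the function names and the sample focal ratios.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def max_decenter_ratio_arcsec(focal_ratio, coma_limit_arcsec=1.0):
    """Largest L'/R (arc-seconds) that keeps ATC within the coma limit.

    Inverts ATC = (3 / (16 F^2)) * (L'/R); the relation is linear, so
    L'/R and ATC can both be expressed in arc-seconds.
    """
    return 16.0 * focal_ratio**2 * coma_limit_arcsec / 3.0


def astigmatism_arcsec(focal_ratio, ratio_arcsec):
    """AAS = (1 / (2 F)) * (L'/R)^2 (assumed form), returned in arc-seconds.

    The relation is quadratic, so L'/R is converted to radians first.
    """
    ratio_rad = ratio_arcsec / ARCSEC_PER_RAD
    return ratio_rad**2 / (2.0 * focal_ratio) * ARCSEC_PER_RAD


# Hypothetical focal ratios for a Schmidt camera.
for F in (2.0, 3.0):
    ratio = max_decenter_ratio_arcsec(F)
    aas = astigmatism_arcsec(F, ratio)
    print(f"F = {F}: L'/R <= {ratio:.1f} arc-sec, AAS = {aas:.1e} arc-sec")
```

For these focal ratios the astigmatism comes out at roughly a thousandth of an arc-second, consistent with the statement that it is negligible next to a 1 arc-sec coma limit.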

It is also instructive to take this same system and tilt the mirror with respect
to the corrector, as shown in Fig. 5.20. From the geometry in Fig. 5.20 we see
that $L' = -\alpha R$ and $\psi = \theta - \alpha$, where $\alpha$ is the tilt angle of the mirror and $\psi$ is the
angle of the chief ray relative to the mirror axis. The fact that $\psi$ depends on $\alpha$ is of
no consequence here because $B_1$(cen) and $B_2$(cen) for the mirror are zero,
independent of $\psi$. Hence Eqs. (5.8.2) through (5.8.4) are the same for this system,
with $\alpha$ replacing $L'/R$.

Fig. 5.20. Schmidt camera with axis of mirror, denoted by dashed line, tilted by angle $\alpha$ with
respect to the z-axis of the corrector.

It is not surprising that the systems in Figs. 5.19 and 5.20 have the same 
aberrations because they are, in fact, equivalent. The tilt, in effect, has offset the 
center of curvature of the mirror by a distance L’ from the center of the corrector 
and, because the sphere has no preferred axis, the systems are the same. Note that 
this equivalence between a tilt and decenter does not hold for any surface that has 
a unique axis. 

Before leaving this system, it is worth examining Fig. 5.20 from the point of 
view of Fermat’s Principle. For $\theta = 0$, rays through the upper half of the corrector
are advanced at the mirror while those through the bottom half are retarded. 
Hence an asymmetry is introduced into the reflected wavefront and the dominant 
aberration in the image is coma. 


5.8.c. EXAMPLE: EBERT-FASTIE MIRROR 


As a final example we consider a single concave mirror in combination with an 
intermediate plane mirror, as shown in Fig. 5.21. The beam from a point object 
located off the z-axis is converted to a collimated beam by one side of mirror M 
and returned to the other side of M by the plane mirror. The final image, like the 
original object, is approximately one focal length from the tangent plane to M. 
This optical arrangement is a so-called Ebert-Fastie system and is best known in 
a type of grating spectrometer of that name. In the spectrometer a diffraction 
grating replaces the plane mirror. Here we examine the aberration characteristics 
with a plane mirror only; the details of this system as a spectrometer are left for 
Chapter 15. 

The system shown in Fig. 5.21 has an entrance pupil decentered by $L_1'$ at
distance $W_1$ from the concave mirror M. The plane mirror is located at the focal
point of M, a choice made to preserve the symmetry above and below the z-axis in
Fig. 5.21. For the incident beam $m = m_1 = \infty$; for the final beam $m = m_2 = 0$.
We consider first a special case, that in which the chief ray through the center of
the entrance pupil is parallel to the z-axis, hence $\psi_1 = 0$.

Fig. 5.21. Ebert-Fastie configuration with spherical mirror M. Entrance pupil is at distance $W_1$ from
M and decentered by $L_1'$. See the discussion preceding Eq. (5.8.5).

The entrance pupil is imaged by M and the plane mirror according to Eqs.
(2.3.2) and (2.3.3). Applying those equations we find the entrance pupil for the
second reflection from M located at distance $L_2'$ above the z-axis and distance $W_2$
from M, where

$$L_2' = -L_1'\frac{W_1'}{W_1}, \qquad \frac{1}{W_1'} = \frac{2}{R} - \frac{1}{W_1}, \qquad W_2 = R - W_1'. \tag{5.8.5}$$

With the condition that $\psi_1 = 0$ and, from the geometry in Fig. 5.21, $\psi_2 = 2L_1'/R$,
we now find the aberration coefficients using Table 5.9. With $b = 0$, and $B_1$(cen)
and $B_2$(cen) zero for the first reflection, we find for this first reflection that

$$B_{11} = \frac{L_1'^2}{R^3}(K+1), \qquad B_{21} = \frac{L_1'}{R^3}(K+1). \tag{5.8.6}$$

Taking the entries from Tables 5.9 and 5.6, substituting Eqs. (5.8.5), and doing a
bit of algebra, we find $B_{12} = B_{11}$ and $B_{22} = -B_{21}$ for $\psi_1 = 0$.

We now find the system coefficients using Eq. (5.6.7). Because the beam size
does not change between reflections, the system coefficients are simply the sum
of the surface coefficients. Therefore

$$B_{1s} = \frac{2L_1'^2}{R^3}(K+1), \qquad B_{2s} = 0. \tag{5.8.7}$$

Although the separate reflections have coma, their signs are opposite and the net
coma is zero. This is not surprising, given the symmetry on opposite sides of the
z-axis. On the other hand, we see that astigmatism is present except when
$K = -1$ and the mirror is a paraboloid. Thus a paraboloid, used as shown in Fig.
5.21, is free of third-order spherical aberration, coma, and astigmatism. Any
optical system for which this is true is called an anastigmat.

Given this freedom from aberrations for the case with $\psi_1 = 0$, we now
consider what happens for a paraboloid when $\psi_1 \neq 0$. The differences in this
case are that $B_1$(cen) and $B_2$(cen) are not zero for the first reflection, and that the
angle of the chief ray for the second reflection is $\psi_2 = 2L_1'/R - (W_1/W_1')\psi_1$.
With this result for $\psi_2$, and using Eqs. (5.8.5), we find


Bo, = aM 


=~ {K +1). (5.8.8) 


Thus the system has zero coma if the mirror is a paraboloid. Given this outcome 
we find the astigmatism only for K = —1. The result is 


wi 2W,\  2Liw 
By =% ae + = L= _B,,, (5.8.9) 





hence the system astigmatism is also zero. There is no third-order spherical 
aberration, coma, or astigmatism over the field spanned by $\psi_1$. Although a
paraboloid in this configuration is anastigmatic, there are higher-order aberrations 
that set the limit on image quality. Ray traces also show that the image surface is 
tilted and curved relative to the chief ray coming from the paraboloid. 

It is worth noting here that a spherical mirror in this configuration used in a 
monochromator mode ($\psi_1 = 0$) has both spherical aberration and astigmatism.
This is not a serious problem provided the beam focal ratio is not too small. 
Further discussion of this mode is given in Chapter 15. 


5.9. CONCLUDING REMARKS 


All of the results needed to calculate the aberrations of a general centered 
optical system to third order are now in place. By centered we mean there is a 
single axis of symmetry passing through the vertices of all the optical surfaces. It 
is well to remember that these results are not exact, but for most systems used in 
optical astronomy they are sufficient. Exact image characteristics derived from 
ray-tracing can, of course, be used to supplement the third-order results. 

A comparison of the form of the coefficients in this chapter with those in, for 
example, the book by Born and Wolf (1980), shows a significant difference in 
notation. The results given in Chapter 5 of their classic text are derived in terms of 
Seidel variables, while our results are given in terms of actual variables. Though 
the two approaches give the same final system aberrations, the representation we 
have chosen is more convenient to use in practice. A comparison of the Seidel 
results with those in this chapter is given in Appendix A. 

With the availability of sophisticated computer ray-tracing programs, the
reader may question the necessity of a detailed development of these analytical 
results. From the point of view of an optical designer starting from scratch to 
choose a suitable system for a particular application, the analytical results are 
preferred because one can usually determine rather quickly whether a given type 
of system is appropriate. Once a basic arrangement of optical elements has been 
selected, a computer can be used to optimize the system and check image 
characteristics. 

Getting the required aberration relations has been a lengthy process. It would 
have been sufficient simply to present the final results without the derivations, but 
for the reader who is venturing into this field for the first time it is useful to see 
the source of the results. Discussions in subsequent chapters are directed toward 
finding the characteristics of systems, with the results provided here available for 
reference. 

We also have the results needed to calculate the aberrations introduced when 
one or more of the optical elements in a system is decentered. The general 
treatment is complicated when more than one element is decentered, and we limit 
our following discussion to those cases in which one element is decentered. 

There is one more topic of aberration theory, which is covered in a later 
chapter. In Chapter 14 we use Fermat’s Principle as a starting point to derive the 
characteristics of diffraction grating surfaces. These results, when combined with 
those given here, will allow us to discuss the characteristics of a variety of 
spectrographic instruments. 


APPENDIX A: COMPARISON WITH SEIDEL THEORY 


Some of the key results derived using Seidel theory are shown in the following 
table. Of the five Seidel coefficients for spherical aberration, coma, astigmatism, 
field curvature, and distortion, we present results for all but field curvature. The 
interested reader should consult Chapter 5 in the text by Born and Wolf (1980) for 
all of the results, including derivations. 

Selected results from Table 5.5 in this chapter are in the left-hand column, with 
corresponding Seidel terms in the other columns. The quantities in square 
brackets represent the terms in brackets in Table 5.5. 

Use Eqs. (5.5.9), (5.6.7), and (5.6.8) to calculate the transverse aberrations 
using the entries from Table 5.5. 

Multiply each TA’ by (s'/n'h) to get the transverse aberrations using the Seidel 
approach. 




Aberration Coefficients 





Table 5.5 Seidel Coefficient! TA’!2,3 
B, = ~310)] B = Ll0] Bp? 

B, = Zt l0) F = LHW [®)] ~Fyyp? 
a= -EP 6) c= yw] 208 
By =P io) B= LPO) -a 
'h = —y/p; Hy) = —Ww. For multisurface systems the 


Seidel coefficients are computed for each surface and 
summed to get a single coefficient. 

? TA’ is the transverse aberration in Seidel coordinates. 
3 p is the height of the marginal ray at the pupil; y is the 
height of the marginal ray at the surface. 

Chapter 6 Reflecting Telescopes


Reflecting telescopes and their associated instrumentation are the principal 
tools of the observational astronomer. In this chapter we consider the character- 
istics of the reflecting telescope in many of its various forms. Although refracting 
telescopes are still in use, they are relatively few in number and do not compete in 
light gathering power with the large reflectors. We choose to consider reflecting 
telescopes only. 

In the discussions to follow we consider the various kinds of reflectors, their 
inherent aberrations for a distant object field, and their advantages and limitations. 
Because of the aberrations there are definite field limitations, which are noted for 
each type. The aberration calculations are based on the results of Chapter 5, with 
the results of the calculations presented in terms of angular measure as seen on 
the sky (or object field). These measures are given in both analytical and 
numerical form, with the latter given in units of arc-seconds. Although close 
attention is given to the sign convention in deriving the aberration formulas, the 
final angular results are given without regard for sign. The one exception to this is 
the field curvature for which the sign is essential. 

In addition to giving the aberration characteristics of aligned two-mirror 
telescopes, we discuss the effect of misalignment between their mirrors. The 
trend in many of the recently designed large telescopes is to make them as short 
as possible, hence a “fast” primary mirror. Aberrations introduced by misalign- 
ments in such telescopes can be quite significant and much effort is devoted to 
keeping these aberrations within acceptable limits on a near real-time basis.

Descriptions of many of the types discussed here appear in the literature with
references given at the end of the chapter. An especially complete treatment is 
given by Wilson (1996). In our discussion we cover a large number of telescope 
types with a common notation to facilitate comparison between them. In this 
chapter and succeeding ones our discussion assumes the reader has digested the 
main themes in the preceding chapters. If this is not the case, then, at a minimum, 
Section 2.5, Chapter 4, and Sections 5.4, 5.6, and 5.7 should be reviewed. Only 
pure mirror systems are considered in this chapter, including discussions of three- 
and four-mirror telescopes. Schmidt telescopes and systems with refracting 
corrector systems are the subjects of Chapters 7-9. 


6.1. PARABOLOID 


The single-mirror paraboloid is the simplest telescope that is free from 
spherical aberration, a result noted in Chapter 5. A paraboloid is almost always 
used with the aperture stop at the mirror and thus the aberrations and field 
curvatures can be taken directly from Tables 5.4 and 5.7. Results are given in 
Table 6.1, where y is the height of a marginal ray at the mirror and the telescope 
focal ratio F = |R/4y|. 

From the transverse aberrations given in Table 6.1 we find a coma flare 
directed away from the center of the field (TSC > 0), and a tangential astigmatic 
image lying closer to the mirror than the sagittal line image (TAS < 0). For the 
angular aberrations, we divide each of the transverse aberrations by s’ and drop 
any leading minus signs to get the results shown in Table 6.1. We choose this 
approach for the angular aberrations because it is usually their absolute size that 
is of primary concern. 

Results for angular aberrations from Table 6.1 are shown in Fig. 6.1 for three
focal ratios. The principal item to notice in Fig. 6.1 is the dominance of coma for
small field angles, which sets the limit to the radius of the field over which the
image quality can be considered “good.” By “good” we mean an angular blur
size that is less than or equal to the blur given an otherwise perfect image by
atmospheric distortion. In our discussions we take the typical blur due to
atmospheric effects as 1 arc-sec.

Table 6.1

Aberrations of Paraboloid Telescope

Fig. 6.1. Angular aberrations of paraboloid in collimated light at selected focal ratios. Solid lines:
sagittal coma; dashed curves: astigmatism. Number on each curve is focal ratio. Axes: angular
aberration (arc-sec) versus $\theta$ (arc-min). See Table 6.1.

A typical comatic image is shown in Fig. 5.9. Setting the total span of this 
image equal to the atmospheric blur, we can use Fig. 6.1 and the relation between 
tangential and sagittal coma to determine the limiting field radius for “good” 
images. The results are given in Table 6.2 for a blur of 1 arc-sec, from which 
it is clear that the paraboloid is limited to small fields, especially for small focal 
ratios. 


Table 6.2

Limiting Field Radius for Good^a Images: Paraboloid Telescope

F      $\theta$ (arc-min)
4      1.42
8      5.69
10     8.89

^a Good defined as tangential coma that measures 1 arc-sec.
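The entries of Table 6.2 can be reproduced with a short calculation. The sketch below assumes the standard paraboloid result ATC $= 3\theta/16F^2$ for the angular tangential coma and sets it equal to the blur limit; the function name is ours, and the focal ratio $F = 4$ for the first row is inferred from the tabulated radius of 1.42 arc-min rather than read directly from the table.

```python
def limiting_field_radius_arcmin(focal_ratio, blur_arcsec=1.0):
    """Field radius (arc-min) at which tangential coma equals the blur limit.

    Uses ATC = 3 * theta / (16 * F**2). The relation is linear in theta,
    so theta and ATC can both be carried in arc-seconds.
    """
    theta_arcsec = 16.0 * focal_ratio**2 * blur_arcsec / 3.0
    return theta_arcsec / 60.0


for F in (4, 8, 10):
    print(f"F = {F:>2}: theta = {limiting_field_radius_arcmin(F):.2f} arc-min")
```

The quadratic dependence on $F$ is what makes fast paraboloids so restricted in field: halving the focal ratio cuts the usable field radius by a factor of four.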

With coma as the aberration that limits the size of the field, the placement of
the aperture stop at any other position does not improve the image quality. Coma 
is independent of stop position when spherical aberration is zero, as noted in 
Section 5.5, and reducing astigmatism has little effect on the overall blur size. 

The remaining items in Table 6.1 are related to field curvature. Here $\kappa_p$, $\kappa_m$,
and u,, are, respectively, the Petzval curvature, median curvature, and sag of the 
median surface midway between the tangential and sagittal focal surfaces. Given 
the limiting field radii in Table 6.2, the reader can verify that for any practical 
focal length f the median image surface is essentially flat. 

In summary, then, the paraboloid telescope is limited to small fields with coma 
setting the field limit. All other aberrations are negligible over this field. 


6.2. TWO-MIRROR TELESCOPES 


We introduced the topic of two-mirror telescopes in Chapter 2 with schematic 
diagrams of two types, Cassegrain and Gregorian, in Fig. 2.7, as well as a set of 
definitions of normalized parameters with which to describe any two-mirror 
telescope. Selected items from Section 2.5 and Table 2.1 are summarized in Table 
6.3 for convenient reference. 

It is instructive to study the relations between the normalized parameters in 
Table 6.3 because they define the bounds on the parameters for each of the 
possible telescope types. For all types we require that the final image is real. 

If the primary is concave, hence $f_1$ positive, the requirement of a real final
image means $mk > 0$. If $m$ and $k$ are positive the telescope type is Cassegrain; if
their signs are negative the telescope type is Gregorian. In both cases $|k| < 1$ to
ensure that some light reaches the primary.

If the primary is convex, hence $f_1$ negative, then a real final image requires that
$mk$ is negative. In this case the secondary must be larger than the primary, hence
$k > 1$, and $m$ is negative. This type of telescope, with its concave secondary, is
the so-called inverse Cassegrain.

Table 6.3

Normalized Parameters for Two-Mirror Telescopes

$k = y_2/y_1$ = ratio of ray heights at mirror margins
$\rho = R_2/R_1$ = ratio of mirror radii of curvature
$m = -s_2'/s_2 = f/f_1$ = transverse magnification of secondary
$f_1\beta = D\eta$ = back focal distance, or distance from vertex of primary mirror to final focal point
$\beta$ and $\eta$, back focal distance in units of $f_1$ and $D$, respectively
$F_1 = |f_1|/D$ = primary mirror focal ratio
$W = (1 - k)f_1$ = distance from secondary to primary mirror = location of telescope entrance pupil
relative to the secondary when the primary mirror is the aperture stop
$mkf_1$ = distance from secondary to focal surface
$F = |f|/D$ = system focal ratio, where $f$ is telescope focal length

$$m = \frac{\rho}{\rho - k}, \qquad \rho = \frac{mk}{m-1}, \qquad k = \frac{1+\beta}{m+1}$$

The different combinations of $m$, $k$, and $\rho$ are summarized in Table 6.4. It is
worth noting here that among the Cassegrains with concave secondary and the
inverse Cassegrains are the so-called Couder and Schwarzschild designs that will
be discussed later in this chapter. The Cassegrain with flat secondary is not
included in the analysis and discussion to follow.

We now proceed to find the aberration relations for two-mirror telescopes
using Eqs. (5.6.11) and the aberration coefficients of the primary and secondary
in Eqs. (5.6.9) and (5.6.10). Before writing the system aberration coefficients, $W$
is written in terms of the normalized parameters: $W/R_2 = (k - 1)/2\rho$, and
$W/s_2 = (k - 1)/k$. With these substitutions, and after straightforward but tedious
algebra, the two-mirror aberration coefficients given in Table 6.5 are found. Note
that these coefficients apply to any pair of conic mirrors, including pairs for
which the spherical aberration is not zero. It is worth noting that $B_{3s}$ is the only
aberration coefficient affected by the conic constant of the primary mirror. An
error in $K_1$, such as for the Hubble Space Telescope, has no effect on the off-axis
aberrations.

We can also use the condition for zero spherical aberration and rewrite the
aberration coefficients in terms of $K_1$. Setting $B_{3s}$ in Table 6.5 equal to zero we
find, after more algebra, the results given in Table 6.6. These results are based on
a choice of locating the aperture stop at the primary mirror. When spherical
aberration is zero, coma is independent of the stop location; when both SA and
coma are zero, astigmatism is independent of the stop position. We will comment
further on these conditions when discussing specific types of telescopes.


Table 6.4

Parameter Combinations for Two-Mirror Telescopes^a

$m$       $k$     $\rho$     Type                 Secondary
> 1       > 0     > 0        Cassegrain           convex
= 1       > 0     $\infty$   Cassegrain           flat
0 to 1    > 0     < 0        Cassegrain           concave
< 0       < 0     < 0        Gregorian            concave
< 0       > 1     > 0        Inverse Cassegrain   concave

^a For $m = 1$, $k = (1 + \beta)/2$.
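The sign logic of Table 6.4 can be written out as a small lookup. The sketch below classifies a two-mirror telescope from $m$ and $k$ and evaluates $\rho = mk/(m-1)$ from Table 6.3 as a cross-check on the sign of $\rho$; the function names and the sample parameter values are hypothetical.

```python
def two_mirror_type(m, k):
    """Classify a two-mirror telescope from m and k, following Table 6.4."""
    if m > 0 and k > 0:
        if m > 1:
            return "Cassegrain, convex secondary"
        if m == 1:
            return "Cassegrain, flat secondary"
        return "Cassegrain, concave secondary"
    if m < 0 and k < 0:
        return "Gregorian, concave secondary"
    if m < 0 and k > 1:
        return "inverse Cassegrain, concave secondary"
    return "not a combination listed in Table 6.4"


def radius_ratio(m, k):
    """rho = R2 / R1 = m k / (m - 1); the secondary is flat when m = 1."""
    if m == 1:
        raise ValueError("m = 1 gives a flat secondary (rho is infinite)")
    return m * k / (m - 1.0)


# Hypothetical parameter sets, one per row of Table 6.4 except the flat case.
for m, k in ((4.0, 0.3125), (0.5, 0.6), (-4.0, -0.4), (-2.0, 1.5)):
    print(f"m = {m:+.2f}, k = {k:+.4f}: {two_mirror_type(m, k)}, "
          f"rho = {radius_ratio(m, k):+.3f}")
```

In each case the computed sign of $\rho$ should agree with the corresponding row of the table.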

Table 6.5


General Aberration Coefficients for Two-Mirror Telescopes

Bos = lm + B)(3m + B) + Ga y (m — B) "|


The choice of parameters used in Tables 6.5 and 6.6 is arbitrary and different
combinations may be more convenient, depending on the application. The
advantage of expressing the system coefficients in terms of $m$ and $\beta$ is that
certain important conclusions are more easily deduced.

Getting from the system coefficients in Tables 6.5 and 6.6 to the transverse
aberrations requires substituting each $B_{is}$ in turn into Eq. (5.6.8) where, as noted
in Section 5.6, $s'/n' = kf$ and $y_1/y_2 = 1/k$. Take care to note that Eq. (5.6.8)
gives tangential coma; our results are given for sagittal coma. To get the angular
aberration as an angle projected on the sky, the transverse aberration is divided by
$f$ and any leading minus signs are dropped. Angular aberrations are given in
Table 6.7, with quantities in brackets taken from Table 6.5 or 6.6.


Table 6.6

Aberration Coefficients for Two-Mirror Telescopes with $B_{3s} = 0$^a

6 m?>(m — B) 8
Bos = [r++ (Ki +0]

^a In terms of $m$ and $\beta$, spherical aberration is zero according to
the relation

$$K_1 + 1 = \frac{(m-1)^3(1+\beta)}{m^3(m+1)}\left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right].$$

Table 6.7


Angular Aberrations of Two-Mirror Telescopes* 
1M 3 1 
A =f - = —] - 
(7) |- l-al] 
emy ae “A 
asc =2 (2%) -|=| =, ATC 
ag fil #2 te 
wee([-]-BE-] sen 


* Terms in square brackets are taken from Table 6.5 
or 6.6. 


Before discussing the characteristics of specific telescope types we give the
general relations for the image surface curvatures based on the results in Section
5.7. As noted following Eq. (5.7.13), the coefficient $B_{1s}$ and the angle $\theta'$ are
referenced to the last surface in the system, hence the secondary mirror. The
relation between this $\theta'$ and the field angle $\theta$ is derived by noting that the focal
surface to secondary distance is $k$ times smaller than $f$. Hence a point on the
image surface, which subtends angle $\theta$ on the sky, subtends angle $\theta/k$ at the
secondary.

The coefficient $B_{1s}$ referenced to the secondary is calculated using the relation
in Table 5.7. The relation between this result, denoted $B_{1s}$(sec), and that given in
Table 6.5 or 6.6, denoted $B_{1s}$(pri), is $k^2 B_{1s}(\mathrm{sec}) = B_{1s}(\mathrm{pri})$. Therefore

$$B_{1s}(\mathrm{sec})/\theta'^2 = B_{1s}(\mathrm{pri})/\theta^2,$$

$$\kappa_p = \frac{2}{R_1}\left[\frac{m(m-\beta) - (m+1)}{m(1+\beta)}\right],$$
_ 2 |m = 2)(m— B) + mn +1) _ m(m— py 
e ee et ae + 


(6.2.1) 


With all of the necessary relations now in hand, we turn our attention to a 
discussion of the characteristics of specific two-mirror telescopes. The categories 
considered in greatest detail are the so-called classical telescopes, those for which 
the primary mirror is a paraboloid, and the aplanatic telescopes, those with zero 
coma. We also discuss less widely used types, such as the Dall-Kirkham with its 
spherical secondary, a two-mirror version with spherical primary mirror, and 
several variants of the aplanat. 

6.2.a. CLASSICAL TYPE


This category of two-mirror telescopes is one for which $K_1 = -1$. The
condition for zero spherical aberration then requires that the conic constant of
the secondary is

$$K_2 = -\left(\frac{m+1}{m-1}\right)^2. \tag{6.2.2}$$


For the Cassegrain, m > 0 and the secondary is a hyperboloid; for the Gregorian 

and inverse Cassegrain, m < 0 and the secondary is a prolate ellipsoid. 
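A quick numerical check of Eq. (6.2.2) makes the hyperboloid and ellipsoid statement concrete. The sketch below evaluates $K_2 = -[(m+1)/(m-1)]^2$ and names the resulting conic using the usual ranges of the conic constant; the function names and the sample magnifications are assumptions for illustration.

```python
def classical_secondary_conic(m):
    """K2 = -((m + 1) / (m - 1))**2 for a paraboloidal primary (K1 = -1)."""
    if m == 1:
        raise ValueError("m = 1 corresponds to a flat secondary")
    return -((m + 1.0) / (m - 1.0)) ** 2


def conic_name(K):
    """Name the conic section of revolution for conic constant K."""
    if K < -1:
        return "hyperboloid"
    if K == -1:
        return "paraboloid"
    if K < 0:
        return "prolate ellipsoid"
    if K == 0:
        return "sphere"
    return "oblate ellipsoid"


# Hypothetical magnifications: Cassegrain (m > 0) and Gregorian (m < 0).
for m in (4.0, -4.0):
    K2 = classical_secondary_conic(m)
    print(f"m = {m:+.1f}: K2 = {K2:.3f} ({conic_name(K2)})")
```

For $m > 1$ the ratio $(m+1)/(m-1)$ exceeds unity, giving $K_2 < -1$; for $m < 0$ its magnitude is below unity, giving $-1 < K_2 < 0$.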

With the substitution of $K_1 = -1$ in Eq. (6.2.1) and the formulas in Tables 6.6
and 6.7, the aberration expressions are much simplified. For convenient reference, 
these relations are given in Table 6.8. 

The first thing to note about the relations in Table 6.8 is that the coma is 
exactly the same as that of a paraboloid of the same focal ratio, as given in Table 
6.1. Note also that this is true for either a Cassegrain or Gregorian, hence neither 
type has an advantage with respect to this aberration. 

To evaluate the astigmatism, we note that $\beta$ is typically a small positive
number of the order of a few tenths, while $|m|$ is typically ten or more times
larger. A good measure of the astigmatism is thus obtained by setting $\beta = 0$, with
the result that AAS $= m\theta^2/2F$. A comparison of this result with AAS in Table
6.1 shows that a classical telescope whose focus is at the primary mirror vertex
has astigmatism $|m|$ times larger than that of a paraboloid of the same $F$. As in the
case of coma, there is no discriminant due to astigmatism between Cassegrain and
Gregorian types.

Because $F = |m|F_1$, the astigmatism for $\beta = 0$ can also be written as
AAS $= \theta^2/2F_1$. Thus the astigmatism in this case depends only on the focal
ratio of the primary mirror. A comparison of the tangential coma blur size with the
astigmatic blur diameter shows that coma is almost always the aberration that sets
the limiting field size for good images, as the reader can verify by constructing a
diagram similar to that of Fig. 6.1. Spot diagrams for an f/10 Cassegrain with
$m = 4$ and $\beta = 0.25$ are shown in Fig. 6.2. The images for a classical Gregorian
are very similar.

Table 6.8

Aberrations of Classical Two-Mirror Telescopes

Looking at the curvature of the median image surface in the case $\beta = 0$, we
see that it is approximately $2(m + 1)/R_1$. This relation is not exact, but it
illustrates three features of the image surface. First, the sign of $\kappa_m$ is opposite
for the Cassegrain and Gregorian types; the surface of best images for the 
Cassegrain (Gregorian) is concave (convex) as seen from the secondary. Second, 
the curvature is larger for the Cassegrain than for the Gregorian. And, third, the 
median image surface is more strongly curved for larger |m|. This, however, is 
rarely a limitation because the field covered is usually smaller in angle when m is 
larger. 

In summary, the classical two-mirror telescope is limited to small fields with 
coma setting the field limit for good images. Compared to the paraboloid the 
astigmatism is larger, but for small fields this is rarely a limiting factor. With 
coma as the dominant aberration, and independent of the stop location when 
spherical aberration is zero, the location of the aperture stop could be changed, if 
necessary, without significantly changing the character of the images. Thus it is 
acceptable to locate the aperture stop at the secondary mirror, as is often done in 
infrared telescopes. 

Fig. 6.2. Spot diagrams for 1.2-m f/10 classical Cassegrain with $m = 4$ and $\beta = 0.25$. Box width
is 2 arc-sec.

Because of the small field size, distortion is typically a few thousandths of an
arc-second and thus much smaller than the atmospheric blur. Compared to the 
asymmetry of a comatic image, as seen in Fig. 6.2, distortion is not important. 
The only differences between the Cassegrain and Gregorian of the classical type 
are the sign and magnitude of the image surface curvature but, given the relatively 
small usable fields, these differences are usually of little consequence. 


6.2.b. APLANATIC TYPE 


The classical telescope is clearly limited in field coverage by the presence of 
coma in the off-axis images. In this section we consider the category of telescopes 
for which, to third order, coma is zero. As noted in Section 5.4, any optical 
system in which both spherical aberration and coma are absent is called an 
aplanat. In recent years the aplanatic Cassegrain telescope, or Ritchey-Chretien 
as it is commonly called, has been the overwhelming choice of builders of large 
telescopes of 2-m aperture or larger, including the 2.4-m Hubble Space Tele- 
scope. Thus this class of telescope has been carefully studied and merits our close 
attention. An extensive article by Wetherell and Rimmer (1972) is an additional 
source of information on aplanatic telescopes, as is the text by Wilson (1996). 

It should not be surprising that both spherical aberration and coma can be 
eliminated in a system with two conic mirrors. A glance at the condition for zero 
spherical aberration in Table 6.6 shows that there are two free parameters, the 
conic constants of the mirrors. One conic constant is chosen to make B_2s in Table
6.6 zero, after which the condition for zero spherical aberration sets the other.
Thus we find that the conditions for an aplanatic telescope are

$$K_1 = -1 - \frac{2(1+\beta)}{m^2(m-\beta)}, \tag{6.2.3}$$

$$K_2 = -\left(\frac{m+1}{m-1}\right)^2 - \frac{2m(m+1)}{(m-\beta)(m-1)^3}. \tag{6.2.4}$$


For the Ritchey-Chretien (RC) the primary is now a hyperboloid, as is the 
secondary. The conic constant for the secondary of the RC is more negative than 
for the classical Cassegrain. For the aplanatic Gregorian (AG) the primary is now 
an ellipsoid. The conic constant for the secondary of the AG is more negative 
than that of the classical Gregorian, provided |m| > 1, but the conic is still 
ellipsoidal. 
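
As a numerical check on the aplanatic conditions, the short Python sketch below evaluates the two conic constants for given m and β, assuming Eqs. (6.2.3) and (6.2.4) have the forms coded in the comments. The function name `aplanatic_conics` is an illustrative choice and not from the text; the sample values m = ±4 and β = 0.25 are those used for the telescopes of Table 6.10.

```python
def aplanatic_conics(m, beta):
    """Conic constants (K1, K2) of an aplanatic two-mirror telescope.

    m    : secondary magnification (m > 0 Cassegrain, m < 0 Gregorian)
    beta : back focal distance in units of the primary focal length
    """
    # Eq. (6.2.3): K1 = -1 - 2(1 + beta) / [m^2 (m - beta)]
    k1 = -1.0 - 2.0 * (1.0 + beta) / (m**2 * (m - beta))
    # Eq. (6.2.4): K2 = -[(m + 1)/(m - 1)]^2 - 2m(m + 1) / [(m - beta)(m - 1)^3]
    k2 = (-((m + 1.0) / (m - 1.0)) ** 2
          - 2.0 * m * (m + 1.0) / ((m - beta) * (m - 1.0) ** 3))
    return k1, k2


if __name__ == "__main__":
    for name, m in (("RC", 4.0), ("AG", -4.0)):
        k1, k2 = aplanatic_conics(m, 0.25)
        print(f"{name}: K1 = {k1:.4f}, K2 = {k2:.4f}")
```

With these inputs the results can be compared directly with the K_1 and K_2 entries for the RC and AG columns of Table 6.10.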

In each case the two mirrors have been “bent” in the same direction in the 
manner shown in Fig. 4.10. However, the direction of deformation for the mirrors 
of the RC is opposite that for the AG, as the reader can easily verify.

Table 6.9

Aberrations of Aplanatic Two-Mirror Telescopes

Substitution of Eq. (6.2.3) into Eq. (6.2.1) and the coefficients in Table 6.6
gives the aberrations for the aplanatic telescopes, with the results given in Table 
6.9. 

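As with the classical telescopes, we choose β = 0 to determine the approximate
magnitudes of the aberrations. The results are

$$\mathrm{AAS} = \frac{\theta^2}{2F}\left(\frac{2m+1}{2}\right), \qquad \mathrm{ADI} = \frac{\theta^3}{4}\left(m^2-2\right), \qquad \kappa_m = \frac{2}{R_1}(m+1). \tag{6.2.5}$$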


Compared to the classical type at the same focal ratio, the astigmatism for the RC 
is larger while that of the AG is smaller. At a given R_1, the curvature of the
median image surface is larger for the RC than for the AG, with the curvatures
again of opposite sign. A comparison of κ_m with κ_p in Eq. (6.2.1) shows a median
surface more strongly curved than the Petzval surface for the RC, but less 
strongly curved for the AG. 

The distortion is the same for both types of aplanatic telescope, and is slightly 
less than for a classical type with β = 0. At the edge of the usable field of an
aplanatic telescope the distortion is usually a few hundredths of an arc-second,
and may need to be taken into account in certain types of observations. 

Spot diagrams for an f/10 RC telescope with m = 4 and β = 0.25 are shown
in Fig. 6.3. Note that the field size shown in Fig. 6.3 is twice that of the classical 
Cassegrain in Fig. 6.2 and that at the edges of their respective fields the image 
quality of the RC is significantly better. Note also that the image blur due to 
astigmatism is symmetric and therefore the centers of the images can be located 
more accurately. Because both coma and spherical aberration are zero, the 
location of the stop does not affect the astigmatism. 

In summary, the aplanatic two-mirror telescope has a field limit for good 
images set by astigmatism. Given the symmetric images and significantly larger 
field of the RC compared to the classical Cassegrain, it is not surprising that the 
RC has become the telescope of choice in Cassegrain telescopes. Further
comparison of aplanatic and classical telescopes follows a brief discussion of
other selected two-mirror telescope types.

Fig. 6.3. Spot diagrams for 1.2-m f/10 Ritchey-Chretien telescope with m = 4 and β = 0.25.
Box width is 2 arc-sec. Off-axis field angles are two times larger than in Fig. 6.2.


6.2.c. OTHER TWO-MIRROR TELESCOPES 


In addition to the classical and aplanatic two-mirror telescopes, there are other 
less common types that deserve comment. Because each of these types has one or 
more serious drawbacks, our discussion of each is brief. In this section we 
consider in turn the Dall-Kirkham, two-mirror with spherical primary, two kinds 
of anastigmatic telescopes, and a flat-field aplanat. 

The Dall-Kirkham telescope is one in which the secondary is spherical
(K_2 = 0) and the primary is ellipsoidal, with the appropriate value of K_1 found
from the relation for zero spherical aberration in Table 6.6. It is straightforward to
find the coma coefficient by setting K_2 = 0 in B_2s of Table 6.5 and, for similar
normalized parameters, compare its value with that of a classical Cassegrain. For
β = 0 the Dall-Kirkham has coma that is (m² + 1)/2 times larger than that of the
classical Cassegrain, hence the field of good images for the Dall-Kirkham is
smaller by this same factor. All other aberrations are negligible over this field.

Although the Dall-Kirkham is severely limited in its field coverage, the mirrors 
are relatively easy to build and test, as discussed in Section 18.1, and several
telescopes of this type have been built. One other advantage of the Dall-Kirkham
is that its on-axis image quality is relatively insensitive to misalignments between 
the mirrors, as compared to the classical and aplanatic types. 

Another type of two-mirror telescope that has some attractive features is one 
with a spherical primary mirror (SP). The main advantages of the SP design are 
ease of fabrication and testing of large spherical mirrors, and the possibility of 
making very large segmented primaries by using a number of smaller spherical 
mirrors. Designs for SPs with primaries as large as 25 m in diameter have been
proposed. Zero spherical aberration with an SP design requires a convex oblate
ellipsoidal secondary (K_2 > 0) in the Cassegrain version and a concave hyperboloid
(K_2 < -1) in the Gregorian version, as the reader can verify.

The drawbacks of SP designs are the large off-axis aberrations. This is easily 
verified by setting K_1 = 0 in the coefficients in Table 6.6 and comparing the
results with those found for a classical type with K_1 = -1. For β = 0 these ratios,
SP to classical, are (m³ + 2)/2 for coma and (m² + 4)/4 for astigmatism.
Choosing m = 4 we find a coma ratio of 33 and an astigmatism ratio of 5. 
Relative to a classical Cassegrain the off-axis aberrations are indeed very large. 
The situation for off-axis aberrations is only slightly better for an SP Gregorian. 
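
The coma ratios quoted for the Dall-Kirkham and SP designs can be checked numerically. The sketch below rests on two assumptions that are not spelled out in this section: (1) the zero-spherical-aberration condition takes the form K_1 + 1 = [k(m-1)³/m³][K_2 + ((m+1)/(m-1))²] with k = (1+β)/(m+1), which is consistent with Eqs. (6.2.2)-(6.2.4); and (2) third-order coma varies linearly with K_1, equals the classical value at K_1 = -1, and vanishes at the aplanatic K_1 of Eq. (6.2.3). All function names are illustrative.

```python
def k1_aplanatic(m, beta):
    """Primary conic constant of the aplanat, Eq. (6.2.3)."""
    return -1.0 - 2.0 * (1.0 + beta) / (m**2 * (m - beta))


def k1_for_zero_sa(k2, m, beta):
    """Primary conic giving zero spherical aberration for a chosen K2.

    ASSUMED form of the zero-spherical-aberration condition (see lead-in).
    """
    k = (1.0 + beta) / (m + 1.0)
    return -1.0 + k * (m - 1.0) ** 3 / m**3 * (k2 + ((m + 1.0) / (m - 1.0)) ** 2)


def coma_relative_to_classical(k1, m, beta):
    """Coma divided by the classical value, ASSUMED linear in K1."""
    return 1.0 - (k1 + 1.0) / (k1_aplanatic(m, beta) + 1.0)


if __name__ == "__main__":
    m, beta = 4.0, 0.0
    k1_dk = k1_for_zero_sa(0.0, m, beta)  # Dall-Kirkham: spherical secondary
    print("Dall-Kirkham ratio:", coma_relative_to_classical(k1_dk, m, beta),
          "vs (m^2 + 1)/2 =", (m**2 + 1.0) / 2.0)
    print("Spherical-primary ratio:", coma_relative_to_classical(0.0, m, beta),
          "vs (m^3 + 2)/2 =", (m**3 + 2.0) / 2.0)
```

Only the coma ratios are checked here; the astigmatism ratio would require the astigmatism coefficient of Table 6.6.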

Given these characteristics, SP types are limited to very small fields or 
additional optical elements must be added to achieve a reasonable field size. If, 
for example, additional mirrors were added in the vicinity of the secondary, then 
the aberrations of the overall system could presumably be reduced to acceptable 
levels. But in this case it is no longer a two-mirror telescope. 

The remaining two-mirror telescopes considered are variations of the aplanatic 
type, specifically those for which another aberration is corrected. Because the 
conic constants of the mirrors in an aplanat are chosen to give zero spherical 
aberration and coma, elimination of another aberration will put restrictions on the 
remaining normalized parameters. The available choices are easily found by 
setting each expression in Table 6.9 equal to zero in turn, with a specific 
combination of m and β now required. This combination, in turn, places
restrictions on the remaining parameters. 

The zero-distortion type is of little practical importance because distortion is 
quite small in two-mirror telescopes with small fields of view, and we will not 
discuss this type. The remaining choices are the zero-astigmatism type, or 
anastigmatic aplanat, and the flat-field aplanat. 

For the anastigmatic aplanat the pertinent relations between the parameters are

$$\beta = -m(2m+1), \qquad k = 1 - 2m, \qquad mk = m(1-2m). \tag{6.2.6}$$

The condition for a real final focus requires magnification in the range
0 < m < 0.5 when the primary is concave. For any m in this range, the secondary
is also concave and the focal surface is located between the mirrors. This type of
telescope, the so-called Couder, therefore suffers from the problem that the focal
surface is relatively inaccessible. For a reasonable choice of m, say 0.25, it also 
has a relatively large secondary obscuration compared to the Ritchey-Chretien. 
One final thing to note is that the telescope focal length, in general, is one-half the 
distance between the primary and secondary mirrors. A diagram of a Couder 
design is shown in Fig. 6.4. 
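
The properties claimed for the Couder design follow directly from Eq. (6.2.6). The sketch below, with the illustrative function name `anastigmatic_aplanat`, evaluates the normalized parameters for the m = 0.25 example and prints the quantities discussed above; all lengths are in units of the primary focal length f_1.

```python
def anastigmatic_aplanat(m):
    """Normalized parameters of Eq. (6.2.6) for secondary magnification m."""
    beta = -m * (2.0 * m + 1.0)  # back focal distance in units of f1
    k = 1.0 - 2.0 * m            # secondary-to-primary diameter ratio
    return beta, k


if __name__ == "__main__":
    m = 0.25                     # the Couder example of Fig. 6.4
    beta, k = anastigmatic_aplanat(m)
    separation = 1.0 - k         # primary-secondary separation
    focal_length = m             # telescope focal length f = m * f1
    sec_to_focus = m * k         # secondary-to-focus distance
    print(f"beta = {beta}, k = {k}, k^2 = {k**2}")
    print(f"separation = {separation}, focal length = {focal_length}")
    print(f"secondary-to-focus distance = {sec_to_focus}")
```

For this m the separation is twice the focal length, the secondary-to-focus distance is smaller than the mirror separation (focus between the mirrors), and k² is four times the value for a Cassegrain with k = 0.25.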

Another type of anastigmatic aplanat is found if the primary is convex and
k > 1. From Eq. (6.2.6) we find m < 0, hence there is a real final focus and the
configuration is that of an inverse Cassegrain. For m in the range -0.5 < m < 0,
the focal surface lies between the mirrors because β > 0. (Recall that βf_1 is the
back focal distance, as defined in Section 2.5, with the focus outside of the space
between the mirrors when βf_1 > 0. For an inverse Cassegrain, f_1 < 0 and the
focus is outside of the mirrors when β < 0.) A sketch of this configuration with
β > 0 shows that a blocking plate must be centered in the incident beam to
prevent the focal surface from seeing the incident light directly. This configuration
also has the problem that a significant fraction of the incident light is
reflected back through the hole in the secondary.

For m < —1 some of the light reflected from the secondary passes outside the 
boundary of the primary, and if m is sufficiently negative a significant fraction 
reaches the focus. A feasible configuration of this type is one in which each 
mirror is a sphere. Substituting β = -m(2m + 1) into Eqs. (6.2.3) and (6.2.4),
and setting K_1 = K_2 = 0, gives m = -(1 + √5)/2. The resulting configuration is
the concentric Schwarzschild anastigmat, with the mirrors and curved focal 
surface having the same center, as shown in Fig. 6.5. The reader can verify that 
the fraction of light vignetted by the primary is 0.2 for this telescope. 

Because the secondary mirror is larger than the primary, and because of 
problems with vignetting and unwanted light reaching the focal surface, this 
configuration is not really practicable for a telescope. However, it has been used 
as the basis for cameras in spectrographs. 
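
The statement that both mirrors become spheres at m = -(1 + √5)/2 can be verified numerically. The sketch below assumes the aplanatic conditions of Eqs. (6.2.3) and (6.2.4) in the forms coded here, together with the anastigmatic relations of Eq. (6.2.6); the function names are illustrative.

```python
import math


def aplanatic_conics(m, beta):
    """Conic constants (K1, K2) from Eqs. (6.2.3) and (6.2.4)."""
    k1 = -1.0 - 2.0 * (1.0 + beta) / (m**2 * (m - beta))
    k2 = (-((m + 1.0) / (m - 1.0)) ** 2
          - 2.0 * m * (m + 1.0) / ((m - beta) * (m - 1.0) ** 3))
    return k1, k2


def schwarzschild_parameters():
    """Return (m, beta, k, K1, K2) for the concentric anastigmat."""
    m = -(1.0 + math.sqrt(5.0)) / 2.0
    beta = -m * (2.0 * m + 1.0)  # anastigmatic condition, Eq. (6.2.6)
    k = 1.0 - 2.0 * m            # diameter ratio, Eq. (6.2.6)
    k1, k2 = aplanatic_conics(m, beta)
    return m, beta, k, k1, k2


if __name__ == "__main__":
    m, beta, k, k1, k2 = schwarzschild_parameters()
    print(f"m = {m:.4f}, beta = {beta:.4f}, k = {k:.4f}")
    print(f"K1 = {k1:.2e}, K2 = {k2:.2e}")
```

Both conic constants vanish to rounding error, and k evaluates to 2 + √5, so the secondary is indeed larger than the primary.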


Fig. 6.4. Couder anastigmat with m = 0.25 and k = 0.5.

Fig. 6.5. Schwarzschild concentric anastigmat with C the center of curvature of surfaces.
Parameters are m = -(1 + √5)/2 and k = 2 + √5.


The flat-field aplanat is defined by κ_m = 0, with the relations for selected
parameters given by

$$\beta = \frac{m^2}{m-1}, \qquad k = \frac{m^2+m-1}{m^2-1}. \tag{6.2.7}$$

An analysis of these relations leads to two possible types: Cassegrain with 
concave secondary and focus between the mirrors, and inverse Cassegrain. Each 
of these types suffers from the same problems of image inaccessibility and 
relatively large vignetting as the corresponding anastigmat. 

And, finally, it is worth noting that a solution for a flat-field anastigmat can be 
found by equating the relations for β in Eqs. (6.2.6) and (6.2.7) and solving for m.
The result of this exercise is m = ±1/√2. It is left for the reader to show that
only the negative solution gives a configuration with a real final image. In this
case the primary mirror is again convex with the secondary (1 + √2) times larger
in diameter than the primary, hence not practical as a telescope. 
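
The flat-field anastigmat solution can be checked by evaluating both expressions for β at the two roots. The sketch below assumes the flat-field condition has the form β = m²/(m - 1) and takes the anastigmatic condition from Eq. (6.2.6); the function names are illustrative.

```python
import math


def beta_anastigmat(m):
    """Back focal distance of the anastigmatic aplanat, Eq. (6.2.6)."""
    return -m * (2.0 * m + 1.0)


def beta_flat_field(m):
    """Back focal distance of the flat-field aplanat (ASSUMED form of Eq. 6.2.7)."""
    return m**2 / (m - 1.0)


if __name__ == "__main__":
    for m in (1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)):
        mismatch = beta_anastigmat(m) - beta_flat_field(m)
        k = 1.0 - 2.0 * m  # diameter ratio from Eq. (6.2.6)
        print(f"m = {m:+.4f}: beta mismatch = {mismatch:.1e}, k = {k:.4f}")
```

Both roots satisfy the two conditions simultaneously, and for the negative root the diameter ratio k equals 1 + √2, in agreement with the text.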

For more details on all of these variations of the aplanat, the reader should 
consult the article by Wetherell and Rimmer (1972). A thorough discussion of all 
the telescopes covered in this section is also given by Wilson (1996). 


6.2.d. COMPARISON OF CLASSICAL AND APLANATIC TYPES 


From the discussion in the preceding sections it should be clear that a two- 
mirror classical or aplanatic telescope can be most easily tailored to meet the 
varied observing demands of astronomers. There is great flexibility in these 
designs to provide the required magnification and image surface accessibility
with generally acceptable vignetting of the incoming beam by the secondary
mirror. Some of the shortcomings of these telescopes can be overcome with 
additional optical elements. Examples include flattening a curved focal surface or 
increasing the usable field size. These topics are discussed in Chapter 9. 

It is therefore appropriate at this stage to take a specific set of parameters and 
show all of the characteristics of each type of classical and aplanatic telescope. 
The parameters chosen are characteristic of those for a typical two-mirror 
telescope with β = 0.25 selected to give an accessible focal surface. Each
telescope has the same primary mirror and overall focal ratio. A listing of 
these parameters, including the conic constants, is given in Table 6.10. 

The important characteristics of each telescope type, using the parameters in
Table 6.10, are given in Table 6.11. In addition to the angular aberrations, entries
are included that provide a normalized measure of the size of each telescope
type.


Table 6.10

Parameters for Two-Mirror Telescopes^a

Parameter    CC        CG        RC        AG
K_1         -1.000    -1.000    -1.0417   -0.9632
K_2         -2.778    -0.360    -3.1728   -0.4052

^a CC, Classical Cassegrain; CG, Classical Gregorian; RC, Ritchey-Chretien;
AG, Aplanatic Gregorian; F_1 = 2.5, |F| = 10, β = 0.25, |m| = 4.


Table 6.11

Characteristics of Two-Mirror Telescopes^{a,b}

Parameter    CC       CG       RC       AG
m            4.00    -4.00     4.00    -4.00
k            0.25    -0.417    0.25    -0.417
1 - k        0.75     1.417    0.75     1.417
mk           1.000    1.667    1.000    1.667
ATC          2.03     2.03     0.00     0.00
AAS          0.92     0.92     1.03     0.80
ADI          0.079    0.061    0.075    0.056
κ_m R_1      7.25    -4.75     7.625   -5.175
κ_p R_1      4.00    -8.00     4.00    -8.00

^a Parameters are those of telescopes in Table 6.10. Aberrations are given at a
field angle of 18 arc-min in units of arc-seconds.

^b Coma is given in terms of tangential coma.

From the results in Table 6.11 we can deduce the approximate field angle at
which the dominant angular aberration is equal to the diameter of a star image 
blurred by atmospheric effects or “seeing.” If the blur diameter is 1 arc-sec, the 
field angle at which the aberration blur equals the “seeing” blur is about 9 arc- 
min for the classical telescopes dominated by coma, about 18 arc-min for the RC, 
and about 20 arc-min for the AG. Thus the field diameter is roughly a factor 
of two larger for the aplanatic type of telescope and the field area is 4 times 
larger. 
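
The field angles quoted above follow from scaling the Table 6.11 entries. The sketch below assumes the third-order scalings, coma proportional to the field angle and astigmatism proportional to its square; the function name and the dictionary labels are illustrative.

```python
REF_ANGLE_ARCMIN = 18.0  # field angle at which Table 6.11 is evaluated


def field_angle_for_blur(aberration_at_ref, power, blur=1.0):
    """Field angle (arc-min) at which an aberration reaches `blur` (arc-sec).

    aberration_at_ref : aberration in arc-sec at the 18 arc-min reference angle
    power             : exponent of the field-angle dependence
                        (1 for coma, 2 for astigmatism; third-order ASSUMPTION)
    """
    return REF_ANGLE_ARCMIN * (blur / aberration_at_ref) ** (1.0 / power)


if __name__ == "__main__":
    # Dominant aberration for each type, taken from Table 6.11.
    cases = {
        "classical (coma)": (2.03, 1),
        "RC (astigmatism)": (1.03, 2),
        "AG (astigmatism)": (0.80, 2),
    }
    for name, (value, power) in cases.items():
        angle = field_angle_for_blur(value, power)
        print(f"{name}: {angle:.1f} arc-min")
```

Passing a different `blur` value shows how the usable field changes with seeing, for example for a site with 0.5 arc-sec images.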

From the astigmatic surface curvatures in Table 6.11 we find that, in absolute 
value, the median surfaces have greater curvature and the Petzval surfaces have 
smaller curvature for the Cassegrain types, as compared with the Gregorian types. 
The median surface curvature is also somewhat larger for the aplanatic type 
compared to its classical counterpart. Because R_1 < 0, the astigmatic surfaces as
seen from the secondary are concave and convex for the Cassegrain and 
Gregorian types, respectively. 

If aberrations were the only discriminant of the four telescope types in Table 
6.11, the aplanatic Gregorian would emerge as the preferred choice. Other 
factors, however, strongly favor the RC and it is this type that has been the 
overwhelming choice for new large telescopes over the past three decades. The 
reasons for this choice are to be found in rows 2-4 of Table 6.11.

Recall that k is the ratio of the secondary-to-primary diameter for an on-axis 
light bundle, and thus k² is the minimum fractional area of the primary obscured
by the secondary. The parameter (1 - k) is the separation of the primary and
secondary in units of f_1, while mk is the distance from the secondary to the final
focal surface in the same units. 

Obstruction of the light by the secondary in the Gregorian is clearly larger than 
in the Cassegrain, hence the latter has the edge. Comparing values of (1 — k) for 
the Cassegrain and Gregorian types we find that the primary-secondary separation 
is almost 1.9 times larger for the Gregorian. We also find that the distance from 
the secondary to the focal surface is nearly 70% larger for the Gregorian. Thus for 
a given focal length and primary and final focal ratio, the physical length of the 
Gregorian is substantially greater. 
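
The size comparison can be reproduced from m and β alone. The sketch below assumes the relation k = (1 + β)/(m + 1) among the normalized parameters, which reproduces the k entries of Table 6.11; the function name `layout` and the dictionary keys are illustrative.

```python
def layout(m, beta):
    """Normalized layout of a two-mirror telescope, lengths in units of f1."""
    k = (1.0 + beta) / (m + 1.0)  # ASSUMED relation among k, m and beta
    return {
        "k": k,                        # secondary/primary diameter ratio
        "obscuration": k**2,           # minimum obscured area fraction
        "separation": 1.0 - k,         # primary-secondary separation
        "secondary_to_focus": m * k,   # secondary-to-focus distance
    }


if __name__ == "__main__":
    cass = layout(4.0, 0.25)
    greg = layout(-4.0, 0.25)
    print("separation ratio (Gregorian/Cassegrain):",
          round(greg["separation"] / cass["separation"], 3))
    print("secondary-to-focus ratio:",
          round(greg["secondary_to_focus"] / cass["secondary_to_focus"], 3))
    print("obscuration ratio:",
          round(greg["obscuration"] / cass["obscuration"], 3))
```

The three ratios quantify the statements above: a larger obscuration, a separation almost 1.9 times larger, and a secondary-to-focus distance nearly 70% larger for the Gregorian.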

This greater length has two very significant impacts on the choice of a 
telescope and the cost of an observatory facility. First, the cost of a building 
and dome needed to house the telescope is significantly greater for a larger 
telescope. For a large telescope the building costs are usually comparable to the 
cost of the telescope. Second, the cost of the Gregorian telescope itself is greater 
because the framework supporting the mirrors is longer and more massive. This 
framework must keep the mirrors in proper alignment if the image quality is to be 
held to the values given in Table 6.11. In Section 6.3 to follow we consider the 
effects of misalignment of the primary and secondary and show that significant 
aberrations can be introduced if the mirrors are not properly aligned.

The one feature of the Gregorian that might be important for certain types of
observations is its real exit pupil. A physical stop located here will act to suppress 
stray light scattered from the support structure of the telescope. Unless this 
specific feature of the Gregorian is an essential one, the smaller and less 
expensive RC is preferred. 

Given the preference for a Cassegrain telescope over a Gregorian, it is also 
instructive to compare the characteristics of Cassegrain telescopes with the same 
diameters, focal lengths, and back focal distances. We choose to make this 
comparison for RC telescopes, with the results shown in Table 6.12. Note 
especially the shorter overall telescope length, and thus the expected ease of 
mirror alignment when the primary mirror is “faster.” We also see that the 
astigmatism is somewhat larger for the faster primary mirror. In spite of the larger 
astigmatism at the same field angle, the advantages of a shorter telescope are 
substantial. 
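
The geometric rows of Table 6.12 can be regenerated from the secondary magnification. The sketch below assumes, as the text states, equal diameters, focal lengths, and back focal distances, so that f_1 scales as 1/m and βf_1 is constant; it also assumes the relation k = (1 + β)/(m + 1). The function name `rc_row` and the dictionary keys are illustrative.

```python
def rc_row(m, focal_ratio=10.0, m_ref=4.0, beta_ref=0.25):
    """Geometric parameters of an RC telescope relative to the reference design."""
    f1_rel = m_ref / m                # f1 / f1(ref) at fixed telescope focal length
    beta = beta_ref / f1_rel          # same physical back focal distance beta * f1
    k = (1.0 + beta) / (m + 1.0)      # ASSUMED relation among k, m and beta
    return {
        "F1": focal_ratio / m,        # primary focal ratio
        "f1_rel": f1_rel,
        "beta": beta,
        "k": k,
        "sec_to_focus_rel": m * k * f1_rel,  # m k f1 / f1(ref)
    }


if __name__ == "__main__":
    for m in (4.0, 6.0, 7.0, 8.0):
        row = rc_row(m)
        print(m, {name: round(value, 3) for name, value in row.items()})
```

The astigmatism and curvature rows of the table require the aplanatic expressions of Table 6.9 and are not computed here.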

The choice between the Ritchey-Chretien and classical Cassegrain is not as 
clearcut as that between Cassegrain and Gregorian. For most large telescopes 
intended for stellar observations, the Ritchey-Chretien has been the favored type, 
although the classical Cassegrain was the choice for the Keck 10-m telescopes. 
The discussion in the following section indicates one possible reason for 
choosing the classical configuration. 


6.2.e. HYBRID TYPES 


To take advantage of the design flexibility of Cassegrain telescopes, many
are provided with interchangeable secondaries with each primary-secondary
combination giving a different telescope focal length and focal ratio. With a
secondary other than the one designed for the Cassegrain focus, the telescope
focus is usually located at a different physical position. For a telescope in an
equatorial mount, plane mirrors redirect the light, with the final beam directed
along the polar axis of the telescope to the so-called Coude focus. For a telescope
in an altitude-azimuth (alt-az) mount, a plane mirror directs the light along the
altitude axis to the so-called Nasmyth focus. In both cases m and β are usually
larger than at the Cassegrain focus.


Table 6.12

Comparison of f/10 Ritchey-Chretien Telescopes

Parameter          Ref^a    RC_1     RC_2     RC_3
m                  4.00     6.00     7.00     8.00
F_1                2.50     1.67     1.43     1.25
f_1/f_1(ref)       1.00     0.667    0.571    0.500
β                  0.25     0.375    0.438    0.500
k                  0.25     0.196    0.180    0.167
mkf_1/f_1(ref)     1.000    0.786    0.719    0.667
AAS^b              1.03     1.35     1.49     1.62
κ_m R_1            7.63     9.65     10.53    11.34

^a Parameters of reference RC telescope in Tables 6.10 and 6.11.

^b Astigmatism is given at a field angle of 18 arc-min in units of arc-seconds.

For a classical telescope the conic constant of the secondary is given by Eq. 
(6.2.2) for each m selected, and the telescope is still of the classical type. The 
relations in Table 6.8 apply, with the normalized parameters for the Cassegrain 
replaced by the new values. 

A Ritchey-Chretien primary in combination with a different secondary, on the 
other hand, is no longer aplanatic and the results in Table 6.9 do not apply. This 
type of telescope is not a Ritchey-Chretien and we choose to call it a hybrid 
telescope. The aberration coefficients given in Table 6.6 apply to hybrid 
telescopes, provided K_1 for the original RC primary is used. Denoting the
parameters for the RC as m_r and β_r, the conic constant of the primary is,
according to Eq. (6.2.3), given by

$$K_1 = -1 - \frac{2(1+\beta_r)}{m_r^2(m_r-\beta_r)}. \tag{6.2.8}$$


The conic constant of the secondary is set by the condition that the spherical 
aberration of the hybrid telescope is zero. Substituting K_1 from Eq. (6.2.8) into
the zero spherical aberration relation in Table 6.6 gives

$$K_2 = -\left(\frac{m+1}{m-1}\right)^2 - \frac{2m^3(m+1)}{(m-1)^3(1+\beta)}\,\frac{(1+\beta_r)}{m_r^2(m_r-\beta_r)}, \tag{6.2.9}$$


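where m and β are parameters for the hybrid. Substituting K_1 from Eq. (6.2.8)
into the coma and astigmatism coefficients in Table 6.6 gives

$$B_{2s} = \frac{\theta}{4f^2}\left[1 - \left(\frac{m}{m_r}\right)^2\left(\frac{m-\beta}{m_r-\beta_r}\right)\left(\frac{1+\beta_r}{1+\beta}\right)\right], \tag{6.2.10}$$

$$B_{1s} = \frac{\theta^2}{2f}\left[\frac{m^2+\beta}{m(1+\beta)} + \frac{m}{2m_r^2}\left(\frac{m-\beta}{1+\beta}\right)^2\left(\frac{1+\beta_r}{m_r-\beta_r}\right)\right], \tag{6.2.11}$$

where f is the focal length of the hybrid telescope.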
It is evident from Eqs. (6.2.10) and (6.2.11) that the aberrations are different 
from those of the aplanatic telescope and that coma is not zero. A good measure
of the amounts of coma and astigmatism present is found by setting both β and β_r
to zero, with the results

$$\mathrm{ASC} = \frac{\theta}{16F^2}\left[1 - \left(\frac{m}{m_r}\right)^3\right], \qquad \mathrm{AAS} = \frac{\theta^2}{2F}\left[m + \frac{1}{2}\left(\frac{m}{m_r}\right)^3\right]. \tag{6.2.12}$$


For typical values of m/m_r, three or more, the coma of the hybrid telescope is
much larger than that of the classical type of the same focal ratio and, although
the astigmatism is also larger, the size of the usable field is set by coma.

Dropping the one in the expression for ASC in Eq. (6.2.12), ignoring the
minus sign, and substituting F = mF_1 we can write ASC = Fθ/16(m_r F_1)³. For a
given ASC at the edge of the usable field, Fθ and hence fθ is a constant for a
given RC primary. Because fθ is the linear radius of this field at the hybrid focus,
the larger the magnification of the hybrid secondary the smaller is the usable field
in angular measure. Although the field size is smaller, the observations made at a
Coude or Nasmyth focus are most often made on or near the axis where coma is
not significant.
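
A short numerical sketch makes the size of the hybrid coma concrete. It assumes that the secondary conic of Eq. (6.2.9) and the coma bracket of Eq. (6.2.12) have the forms coded below, with the subscript r denoting the parameters of the original RC design; the function names and the sample hybrid magnification m = 12 are illustrative choices, not values from the text.

```python
def hybrid_k2(m, beta, m_r, beta_r):
    """Secondary conic for zero spherical aberration (ASSUMED form of Eq. 6.2.9).

    m, beta     : parameters of the hybrid focus
    m_r, beta_r : parameters of the RC design that fixed the primary
    """
    classical = -((m + 1.0) / (m - 1.0)) ** 2
    correction = (2.0 * m**3 * (m + 1.0) / ((m - 1.0) ** 3 * (1.0 + beta))
                  * (1.0 + beta_r) / (m_r**2 * (m_r - beta_r)))
    return classical - correction


def hybrid_coma_factor(m, m_r):
    """Bracket of ASC in Eq. (6.2.12): coma in units of the classical value."""
    return 1.0 - (m / m_r) ** 3


if __name__ == "__main__":
    m_r, beta_r = 4.0, 0.25
    # With the original secondary the result reduces to Eq. (6.2.4).
    print("K2 at the RC focus:", round(hybrid_k2(m_r, beta_r, m_r, beta_r), 4))
    # Hypothetical high-magnification secondary with m / m_r = 3.
    print("coma relative to classical:", hybrid_coma_factor(12.0, m_r))
```

When m = m_r and β = β_r the first function returns the RC secondary conic of Table 6.10 and the coma factor is zero, as expected for the aplanat.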





6.2.f. AFOCAL TYPES 


As a final class of two-mirror telescope we consider those that are afocal, 
hence the output beam is collimated and the final image is at infinity. One 
possible application of such a telescope is as a beam reducer, if the secondary 
mirror is smaller than the primary, or as a beam expander if the secondary is 
larger. An even more important application is as the input end of a three- or four- 
mirror telescope. In Section 6.4 of this chapter we discuss selected designs of 
three-mirror telescopes. In anticipation of that section, it is convenient to 
determine the characteristics of afocal two-mirror telescopes and include them 
with other two-mirror telescopes. 

We begin by noting that an afocal telescope is one for which m, the 
magnification of the secondary mirror, is infinite. Thus we can take previously 
derived relations and have them apply after letting m → ∞. Because β, the
normalized back focal distance, also becomes infinite in this limit, it is convenient
to express β in terms of m and k before taking this limit. Following this procedure
for the aberration coefficients in Table 6.5, we get the results shown in Table 6.13. 
Also included is the Petzval curvature from Eq. (6.2.1). 

Examination of the entries in Table 6.13 shows that the obvious choices for
conic constants are K_1 = -1 and K_2 = -1, hence both mirrors are paraboloids.
With these choices we see that the afocal two-mirror telescope is free of spherical 
aberration, coma, and astigmatism, and hence is an anastigmatic aplanat. We 
point out that the same conclusions can be reached by taking Eqs. (6.2.3) and

Table 6.13

General Aberration Coefficients for Afocal Two-Mirror Telescopes


1 1 
B3; =” +1-k(K, +1) = -3z 
1 1 
6fad—-k 8 
Bo, =a" + D] =A 


ETA-k) @ 
Bu=-F| 4k +0] =e 


P-k) 








Ba = gaz HOGHE + (1 — WP Ka 
3 
= OT fork, = -1 


aea 
PRR 


(6.2.4), and AAS from Table 6.9, and letting m → ∞. Note that in the afocal
limit there is no difference between the classical and aplanatic types. 

We now determine some of the characteristics of pupils and chief ray 
directions. From Table 6.3 we see that mk is the normalized distance from the
secondary to the focal point, while from Eq. (2.6.1) we find that the normalized
distance from the exit pupil to the focal point is δ. For an afocal telescope the
ratio δ/mk = 1, or δ/m = k. Substituting this result into Eq. (2.6.3) we see that
ψ, the chief ray angle after reflection from the secondary, is given by ψ = θ/k.
With the aperture stop at the primary mirror, it is left as an exercise for the reader
to show that the distance from the secondary mirror to the exit pupil of an afocal
telescope is given by W' = -k(1 - k)f_1. It is also straightforward to show that
the diameter of the exit pupil is |kD|, where D is the diameter of the primary.
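
The pupil relations above are easy to evaluate for a specific case. The sketch below computes the exit pupil distance, exit pupil diameter, and chief-ray angular magnification of an afocal telescope, and cross-checks the distance with the Gaussian mirror equation. The cross-check assumes a secondary focal length of -k f_1 (confocal mirrors, convex secondary for k > 0); the function names and the sample values k = 0.25, f_1 = 3 m, D = 1.2 m are illustrative.

```python
def afocal_exit_pupil(k, f1, primary_diameter):
    """Exit pupil of an afocal two-mirror telescope with the stop at the primary.

    Returns (distance from secondary to exit pupil, exit pupil diameter,
    chief-ray angular magnification psi / theta).
    """
    distance = -k * (1.0 - k) * f1        # W' = -k (1 - k) f1
    diameter = abs(k * primary_diameter)  # |k D|
    angular_magnification = 1.0 / k       # psi = theta / k
    return distance, diameter, angular_magnification


def exit_pupil_from_mirror_equation(k, f1):
    """Image of the primary formed by the secondary (Gaussian mirror equation)."""
    s = (1.0 - k) * f1   # distance of the stop (primary) from the secondary
    f2 = -k * f1         # ASSUMED secondary focal length for confocal mirrors
    return s * f2 / (s - f2)


if __name__ == "__main__":
    k, f1, diameter = 0.25, 3.0, 1.2   # hypothetical beam reducer, lengths in m
    print(afocal_exit_pupil(k, f1, diameter))
    print(exit_pupil_from_mirror_equation(k, f1))
```

The negative distance indicates a virtual exit pupil behind the convex secondary for k > 0; a Gregorian-type afocal telescope (k < 0) gives a positive distance, that is, a real exit pupil.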

As a final comment about this type of telescope, we note that it follows from 
the stop-shift relations in Section 5.5 that the location of the aperture stop is 
arbitrary in this type of telescope. That is, spherical aberration, coma, and 
astigmatism are zero for any position of the aperture stop. This freedom to 
move the stop and make use of the stop-shift relations will help in the analysis of 
some of the three-mirror telescopes discussed in Section 6.4. 


6.3. ALIGNMENT ERRORS IN TWO-MIRROR TELESCOPES 


We now consider the consequences of an error in the position of the secondary 
mirror relative to the primary in a two-mirror telescope. This position error can be
a decenter and/or tilt of the secondary, either of which is a misalignment between
the axes of the mirrors. The error can also be an axial displacement of the 
secondary toward or away from the primary, in which case the error is called 
despace. In the discussion to follow, aberrations introduced by misalignment are 
treated separately from those resulting from despace. 


6.3.a. TILT OR DECENTER MISALIGNMENT 


In analyzing the effect of misalignment, we begin by noting that the aperture 
stop of the telescope is the primary and the reference axis for the secondary is the 
axis through the vertex of the primary. We consider both coma and astigmatism in 
our general discussion of misalignment, but apply our results for astigmatism 
only to the case of the aplanatic telescope. For the classical telescope the 
astigmatism due to misalignment has little effect on the overall image quality 
and can generally be ignored. 

A possible layout of a misaligned secondary is shown in Fig. 6.6, where the 
secondary is decentered by an amount l in the y direction and tilted through an
angle α about a line perpendicular to the plane of the diagram and tangent to the
mirror at its vertex. In this particular case the displacement of the center of the 
stop from the axis of the secondary, its symmetry axis, is simply the sum of the 
separate displacements due to decenter and tilt. In the general case the displace- 
ments at the stop due to decenter and tilt are not colinear and must be combined 
by vector addition. We consider only the case shown in Fig. 6.6, however, 
because compensation of coma, specifically, due to misalignments, requires 
colinear displacements as shown in Fig. 6.6. 








Fig. 6.6. Secondary (Sec) in two-mirror telescope decentered by l and tilted by angle α with
respect to axis of the primary (Pri). The relation between parameters is given in Eq. (6.3.1).

From the geometry of Fig. 6.6 we see that

$$L' = -(l + \alpha W), \qquad \psi = -(\theta + \alpha), \tag{6.3.1}$$


where L’ is the distance from the center of the stop to the axis of the secondary 
and ψ is the angle between the reflected chief ray and the secondary axis.
Substitution of Eq. (6.3.1) into the coefficients in Tables 5.6 and 5.9 gives 


1f: m+1 m+1 
Ba = Batad + ala e- (a)l -Gea 


= B (sec) + B(mis), (6.3.2) 

l es l W 1 (1 2W0 

By =B -— — 20 —}{1—-—}+4+k,—{—- —— 
ý ave R K : w) 7 (: a rae a) Bee R, (è R, )| 
= B (sec) + B,>(mis) (6.3.3) 
where n = -1 for the secondary. The factors B_2(sec) and B_1(sec) are the coma
and astigmatism coefficients, respectively, from Eq. (5.6.10) and are the coefficients
for a properly aligned secondary. We are interested primarily in the effects
of the terms in Eqs. (6.3.2) and (6.3.3) denoted by B_i(mis) with i = 1, 2.

We take Eqs. (6.3.2) and (6.3.3), substitute into Eq. (5.6.11), and get 


$$B_{2s} = B_{2s}(\mathrm{cen}) + k^3 B_2(\mathrm{mis}), \tag{6.3.4}$$

$$B_{1s} = B_{1s}(\mathrm{cen}) + k^2 B_1(\mathrm{mis}), \tag{6.3.5}$$


where B_is(cen) are the coefficients for an aligned telescope from Table 6.6. Note
that Eqs. (6.3.4) and (6.3.5) apply specifically to the situation shown in Fig. 6.6, 
that is, along the y-axis. Generalization to an arbitrary point on the image is 
considered later in this section. A thorough discussion of the effects of alignment 
errors across the image field is given by Shack and Thompson (1980). 


6.3.b. IMAGE SHIFT FROM MISALIGNED SECONDARY 


In addition to the introduction of coma and astigmatism, a misaligned 
secondary will shift the image field perpendicular to the z-axis in Fig. 6.6. For 
tilt α and no decenter, the chief ray from an on-axis object point is reflected from
the secondary at angle 2α relative to the z-axis. Hence the transverse shift of the
chief ray at the image is 2αmkf_1, a shift in the positive y-direction for a Cassegrain
or Gregorian telescope with mk and α > 0. For decenter l and no tilt, the on-axis
chief ray emerges from the secondary at angle 2φ relative to the z-axis, where
φ = l/R_2 is the slope of the surface normal at the point where the chief ray
hits the secondary. The transverse shift of this chief ray at the image is
2φmkf_1 = -l(m - 1). For l > 0 this shift is in the negative y-direction for m > 0
(Cassegrain) and in the positive y-direction for m < 0 (Gregorian).

For a chief ray in the yz-plane making angle $\theta_y$ with the z-axis, as shown in
Fig. 6.6, the angular shift projected on the sky, $\psi'$, is given by

$$\psi' = \theta_y + \left(2k\alpha - \frac{(m-1)\,l}{mf_1}\right) = \theta_y + 2k\left(\alpha - \frac{(m-1)\,l}{2mkf_1}\right). \tag{6.3.6}$$

The corresponding transverse shift is found from the product $f\psi'$ or $mf_1\psi'$.
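As a numerical check on the image shift, the short sketch below adds the three contributions described above: the field angle, the tilt term $2k\alpha$, and the decenter term $-(m-1)l/(mf_1)$. The function name and the values $m = 4$, $k = 0.25$, and $f_1 = 9000$ mm are illustrative assumptions (they are not stated in this section) chosen to represent a 3.6-m Cassegrain.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # radians per arc-minute


def sky_shift(theta_y, alpha, l, m, k, f1):
    """Angular image shift projected on the sky, in radians.

    theta_y, alpha are in radians; l and f1 share one length unit.
    Terms: field angle + tilt (2*k*alpha) + decenter (-(m-1)*l/(m*f1)).
    """
    return theta_y + 2.0 * k * alpha - (m - 1.0) * l / (m * f1)


# Assumed example: on-axis object, 4.58 arc-min tilt, 3 mm decenter.
psi = sky_shift(0.0, 4.58 * ARCMIN, 3.0, m=4.0, k=0.25, f1=9000.0)
print(f"sky shift = {psi / ARCMIN:.2f} arc-min")
```

Multiplying the returned angle by $mf_1$ gives the corresponding transverse shift in the focal plane.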


6.3.c. COMA FROM MISALIGNED SECONDARY 


The amount of transverse coma introduced in a two-mirror telescope with a
misaligned secondary is easily found by combining Eqs. (6.3.2) and (6.3.4), and
substituting $B_{2s}$ into Eq. (5.6.8). To get the angular coma as an angle projected on
the sky, the transverse coma is then divided by the telescope focal length. The
general result along the y-axis in Fig. 6.6 is

$$\mathrm{ATC} = \mathrm{ATC(cen)} - \frac{3(1+\beta)(m-1)}{16F^2}\left\{\alpha - \frac{(m-1)\,l}{2mkf_1}\left[1 - \frac{m-1}{m+1}K_2\right]\right\}. \tag{6.3.7}$$

The principal feature of Eq. (6.3.7) is that the coma due to misalignment is
independent of the field angle $\theta$, hence it is constant over the field.

For an aplanatic telescope ATC(cen) = 0 and, after substitution for $K_2$ from
Eq. (6.2.4), we get

$$\mathrm{ATC} = -\frac{3(1+\beta)(m-1)}{16F^2}\left\{\alpha - \frac{l}{kf_1}\left[1 + \frac{1}{(m-1)(m-\beta)}\right]\right\}. \tag{6.3.8}$$

For a classical telescope we take ATC(cen) from Table 6.8, substitute for $K_2$ from
Eq. (6.2.2), and find

$$\mathrm{ATC} = \frac{3\theta_y}{16F^2} - \frac{3(1+\beta)(m-1)}{16F^2}\left[\alpha - \frac{l}{kf_1}\right]. \tag{6.3.9}$$

From Eqs. (6.3.8) and (6.3.9) it is apparent that coma due to misalignment can be
made zero by a proper combination of tilt and decenter.

To illustrate the effects of misalignment, we take the set of telescopes whose
parameters are given in Table 6.10 and evaluate the relations in Eqs. (6.3.8) and
(6.3.9) for each type. Setting $\theta_y = 0$ for the classical telescopes, and scaling the
telescopes by choosing D = 3.6 m, we get the results shown in Table 6.14 for
$l$ = 3 mm and $\alpha$ = 3 arc-min. The entries in Table 6.14 retain the signs given by
the relations in Eqs. (6.3.8) and (6.3.9).
Table 6.14

Angular Tangential Coma for Misaligned Secondary (a, b)

              CC       CG       RC       AG
ATC(dec)      1.93     1.93     2.10     2.02
ATC(tilt)    −1.27     2.11    −1.27     2.11

(a) Angular coma is given in units of arc-seconds. Parameters
    of telescopes are given in Table 6.10 with D = 3.6 m.
(b) α = 0.05° = 3.0 arc-min; l = 3 mm = |f|/1.2E4.


From the results in Table 6.14 we see that the secondary in the chosen 
Gregorians is more sensitive to tilt than is the secondary in the Cassegrains. There 
is little difference between the coma introduced by decenter, although the 
aplanatic types have slightly larger values. Recall, however, that the Gregorian 
is significantly longer than the Cassegrain and hence the required tolerances are 
more easily met with a Cassegrain. 
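The separate tilt and decenter contributions can be checked numerically. The sketch below evaluates the misalignment coma as $-[3(1+\beta)(m-1)/16F^2]\,[\alpha - g\,l/(kf_1)]$, with $g = 1$ for a classical telescope and $g = 1 + 1/[(m-1)(m-\beta)]$ for an aplanatic one. The Cassegrain parameters $m = 4$, $k = 0.25$, $\beta = 0.25$, $F = 10$ are assumptions standing in for Table 6.10, which is not reproduced here; with D = 3.6 m they give $f_1$ = 9000 mm.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-second


def atc_misalignment(alpha, l, m, k, beta, F, f1, aplanatic):
    """Coma from secondary tilt alpha (rad) and decenter l, in radians.

    l and f1 must share one length unit. The factor g distinguishes
    the aplanatic case from the classical one.
    """
    coeff = 3.0 * (1.0 + beta) * (m - 1.0) / (16.0 * F**2)
    g = 1.0 + 1.0 / ((m - 1.0) * (m - beta)) if aplanatic else 1.0
    return -coeff * (alpha - g * l / (k * f1))


# Assumed Cassegrain parameters (stand-ins for Table 6.10), lengths in mm.
cass = dict(m=4.0, k=0.25, beta=0.25, F=10.0, f1=9000.0)
tilt = 3.0 * 60.0 * ARCSEC  # 3 arc-min
decenter = 3.0              # mm

cc_dec = atc_misalignment(0.0, decenter, aplanatic=False, **cass) / ARCSEC
rc_dec = atc_misalignment(0.0, decenter, aplanatic=True, **cass) / ARCSEC
rc_tilt = atc_misalignment(tilt, 0.0, aplanatic=True, **cass) / ARCSEC
print(f"CC dec {cc_dec:.2f}, RC dec {rc_dec:.2f}, RC tilt {rc_tilt:.2f} arc-sec")
```

With these assumed parameters the three values agree with the corresponding Cassegrain entries of Table 6.14 to within rounding.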

It is important to note that coma contributions due to the separate misalignments
can be significantly larger in a telescope with a faster primary mirror.
Taking the telescope labeled RC$_3$ in Table 6.12 we find, for the same tilt and
decenter, that ATC(tilt) is 2.8 times larger and ATC(dec) is nearly 8 times larger
than for the RC telescope in Table 6.14.

We return now to the aplanatic telescope. From Eq. (6.3.8) we find ATC = 0 if

$$\alpha = \frac{l}{kf_1}\left[1 + \frac{1}{(m-1)(m-\beta)}\right]. \tag{6.3.10}$$

For an RC telescope with $m > 1$, and $k$ and $f > 0$, we see from Eq. (6.3.10) that $\alpha$
and $l$ have the same sign, while for an aplanatic Gregorian their signs are
opposite. The importance of the result in Eq. (6.3.10) is that, even if the primary
and secondary mirrors are not aligned, there is a tilt that compensates for decenter
and gives an image free from coma due to misalignment.

Any combination of tilt and decenter is equivalent to a rotation of the
secondary around an axis that is perpendicular to the axis of the primary and
intersects it. For the particular combination of $\alpha$ and $l$ in Eq. (6.3.10) the
intersection of these two axes is called the neutral point, and its location on
the primary mirror axis depends on the type of telescope. Denoting $d_{np}$ as the
distance from the secondary mirror to the neutral point, we find from Eq. (6.3.10)

$$d_{np} = \frac{l}{\alpha} = kf_1\left[1 + \frac{1}{(m-1)(m-\beta)}\right]^{-1}. \tag{6.3.11}$$

The sign convention applies to $d_{np}$ in Eq. (6.3.11). For a typical RC telescope $\alpha$
and $l$ have the same sign and the neutral point lies to the left of the secondary
mirror in Fig. 6.6. For an aplanatic Gregorian the neutral point is to the right of
the concave secondary. Because the quantity multiplying $kf_1$ is less than unity, the
neutral point is between the secondary mirror and the focal point of the primary
for an aplanatic telescope. The existence of a neutral point can be used to
advantage if the secondary is deliberately displaced to bring a different source on
to a fixed detector as, for example, is done with many infrared telescopes.

We now turn our attention to a classical telescope and Eq. (6.3.9). The
situation is more complicated than for an aplanatic telescope because the
condition for zero coma from Eq. (6.3.9) has three independent parameters,
$\theta_y$, $\alpha$, and $l$. We choose, therefore, to find the combination of these parameters
that makes ATC zero for an image on the axis of the primary mirror.

Setting $\psi' = 0$ in Eq. (6.3.6), solving for $\theta_y$, and substituting for $\theta_y$ in Eq.
(6.3.9), we find

$$\mathrm{ATC\ (on\text{-}axis)} = -\frac{3}{16F^2}\left[k\alpha(m^2 + 1) - \frac{l}{mf_1}(m^3 - 1)\right]. \tag{6.3.12}$$

Setting Eq. (6.3.12) to zero and solving for $\alpha$ we find

$$d_{np} = \frac{l}{\alpha} = kf_1\left\{1 + \frac{m+1}{m^3-1}\right\}. \tag{6.3.13}$$

The directions of the neutral point from the secondary mirror for classical
telescopes are to the left and right, respectively, for the Cassegrain and Gregorian
versions. Because the quantity in curly brackets in Eq. (6.3.13) is greater than
unity, the distance to the neutral point is slightly larger than $|kf_1|$, the distance
from the secondary to the focal point of the primary mirror.

It is worth noting that the condition for zero coma on the axis of the primary
gives a different relation between $\alpha$ and $l$ than obtained by setting $\theta_y = 0$ in Eq.
(6.3.9) and finding the combination of tilt and decenter that then makes ATC = 0
for an object point on the primary mirror axis. In this latter case the point of zero
coma is shifted from the axis of the primary and the neutral point is at the focal
point of the primary.

Representative combinations of $\alpha$ and $l$ that make coma zero are given in Table
6.15 for both types of Cassegrain telescopes. In all cases $l$ = 3 mm, with $\theta_y$
chosen to represent the two cases discussed in the preceding.
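The zero-coma combinations can be reproduced with a few lines of arithmetic. The sketch below uses three conditions: $\alpha = (l/kf_1)[1 + 1/((m-1)(m-\beta))]$ for the aplanatic case, $\alpha = l/(kf_1)$ for a classical telescope with the object on the primary axis, and $\alpha = l(m^3-1)/[mkf_1(m^2+1)]$ for a classical telescope with the image on the primary axis. The values $m = 4$, $k = 0.25$, $\beta = 0.25$, and $f_1$ = 9000 mm are assumptions standing in for Table 6.10.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # radians per arc-minute

# Assumed Cassegrain parameters (stand-ins for Table 6.10), lengths in mm.
m, k, beta, f1, l = 4.0, 0.25, 0.25, 9000.0, 3.0
decenter_shift = (m - 1.0) * l / (m * f1)  # sky shift from decenter alone

# Aplanatic (RC): misalignment coma vanishes over the whole field.
alpha_rc = (l / (k * f1)) * (1.0 + 1.0 / ((m - 1.0) * (m - beta)))

# Classical, object on the primary axis (theta_y = 0).
alpha_obj = l / (k * f1)
psi_obj = 2.0 * k * alpha_obj - decenter_shift

# Classical, image on the primary axis (sky shift = 0).
alpha_img = l * (m**3 - 1.0) / (m * k * f1 * (m**2 + 1.0))
theta_img = -2.0 * k * alpha_img + decenter_shift

for name, value in [("alpha RC", alpha_rc), ("alpha CC image", alpha_img),
                    ("theta CC image", theta_img), ("alpha CC object", alpha_obj),
                    ("psi CC object", psi_obj)]:
    print(f"{name:16s} {value / ARCMIN:6.2f} arc-min")
```

Under these assumptions the printed angles match the tabulated combinations to within about 0.01 arc-min.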

Table 6.15

Zero Coma Combinations for Cassegrain Telescopes (a, b)

                                                    α (arc-min)   θ (arc-min)   ψ′ (arc-min)
RC                                                     4.99       not relevant
CC (zero coma for image on primary mirror axis)        4.25         −1.26           0
CC (zero coma for object on primary mirror axis)       4.58          0              1.43

(a) Parameters of telescopes are given in Table 6.10 with D = 3.6 m.
(b) l = 3 mm = |f|/1.2E4.

As a final note, recall that Eqs. (6.3.4) and (6.3.9) apply along the y-axis as
they are written. We can generalize Eq. (6.3.4) for a classical telescope and have it
apply at an arbitrary image point for a tilt and decenter combination shown in Fig.
6.6. The result is


Bar = 


1l 
->i 1 — -- 6.3.14 

italo- tom ja- 6304 
where i and j are unit vectors along the x- and y-axes, respectively. The magnitude
of $B_{2s}$ is found from the components in the usual way, while the direction of the
coma flare is along the line from the image point to the point on the y-axis where
the coma is zero.


6.3.d. ASTIGMATISM FROM MISALIGNED SECONDARY 


Misalignment of the mirrors in a two-mirror telescope also introduces 
astigmatism in addition to that already present, as shown by Eqs. (6.3.3) and 
(6.3.5). In this section we consider the nature of this added astigmatism and how 
it affects the inherent astigmatism already present. 

In our analysis we consider only the situation where coma due to misalignment
is zero. Applying this condition to Eq. (6.3.2) gives

$$\alpha = \frac{(m-1)\,l}{2mkf_1}\left[1 - \frac{m-1}{m+1}K_2\right]. \tag{6.3.15}$$

Substituting Eq. (6.3.15) into Eq. (6.3.3), expressing $W$ and $R_2$ in terms of
normalized parameters, and rearranging terms, gives the astigmatism coefficient
for the telescope as


Ns mse l m 
ET A | 9 
For a given decenter we see from Eq. (6.3.16) that the astigmatism due to
misalignment is linear in the angle $\theta_y$ over the field. The subscript is applied to
this angle, as it is in our discussion of coma, because we are considering
specifically the case shown in Fig. 6.6.

There is also a term in Eq. (6.3.16) that is constant over the field. In many 
instances this constant term is negligible compared to the linear term, especially 
near the edge of the usable field. Each case must be analyzed individually to 
determine whether the constant term can be ignored. 

From this point on we consider only aplanatic telescopes. We do this because it 
is coma that largely determines the image quality in Cassegrain telescopes, as 
shown in Fig. 6.2. A change in the astigmatism due to mirror misalignment will 
have little noticeable effect on the usable field of a Cassegrain unless the values of
$\alpha$ and $l$ are unreasonably large.

For aplanatic telescopes we get 


_ FP [mQm+1)+B) 1/1, @-ly l m 
Ais = oat 2m(1 + p) | 2f (pjk (1 + B) fe, * 7 2m — Bm — 5 
(6.3.17) 


O & [mQm+1)+B] 1/1, @-1y I m 
n A 2m + B) l sF(7)™ OFA oe a 


(6.3.18) 














where we can write $\theta^2 = \theta_x^2 + \theta_y^2$.

We now evaluate Eq. (6.3.18) for the RC telescope in Table 6.10 with the tilt
and decenter combination in Table 6.15. With $\theta_x$ and $\theta_y$ expressed in units of
arc-min, Eq. (6.3.18) becomes

$$\mathrm{AAS\ (arc\text{-}sec)} = -3.16\mathrm{E}{-3}\,[\theta_x^2 + \theta_y^2 - 5.42\,\theta_y - 0.276]. \tag{6.3.19}$$

Note that the constant term contributes less than 0.001 arc-sec, a negligible
amount. We now take a cut along the y-axis by setting $\theta_x = 0$ and get the
resulting AAS shown in Fig. 6.7. Also shown are AAS with $l = 0$ and the linear
AAS due solely to misalignment. From Fig. 6.7 and Eq. (6.3.19) we find
AAS = 0 at $\theta_y = 0$ and $\theta_y = 5.42$ arc-min. We also see from Fig. 6.7 that the
effect of the addition of linear astigmatism is to shift the curve for aligned mirrors
and to put the minimum in the curve at a point midway between the corrected
points, that is, the shifted curve appears symmetric on either side of $\theta_y = 2.71$
arc-min.
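The behavior of Eq. (6.3.19) along the y-axis is easy to confirm numerically. The sketch below simply codes the quadratic with the coefficients quoted in the text; the function name is our own, and angles are in arc-min with the result in arc-sec.

```python
def aas_arcsec(theta_x, theta_y):
    """Angular astigmatism (arc-sec) from Eq. (6.3.19); angles in arc-min."""
    return -3.16e-3 * (theta_x**2 + theta_y**2 - 5.42 * theta_y - 0.276)


# The quadratic in theta_y is symmetric about half the linear coefficient.
theta_sym = 5.42 / 2.0

# Cut along the y-axis at a few field angles (arc-min).
for theta_y in (0.0, theta_sym, 5.42, 10.0):
    print(f"theta_y = {theta_y:5.2f}  AAS = {aas_arcsec(0.0, theta_y):+.4f} arc-sec")
```

The values at 0 and 5.42 arc-min differ from zero only by the small constant term, and points equally spaced on either side of 2.71 arc-min give equal AAS.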

Through-focus spot diagrams at selected values of $\theta_y$ are shown in Fig. 6.8
relative to a focal surface whose curvature is computed from the relation in Table
6.9. From Figs. 6.7 and 6.8 we see that AAS in the range $0 < \theta_y < 5.42$ arc-min
is of opposite sign to AAS outside of this range. In the range given, the sagittal
image (along the y-axis) is closer to the secondary mirror than the tangential
image; outside of this range the sagittal image is farther from the secondary.

Fig. 6.7. Angular astigmatism (solid line) for Ritchey-Chretien telescope misaligned along y-axis.
The AAS is the sum of astigmatism for aligned telescope (dotted line) and linear astigmatism (long
dashed line). Parameters of the telescope are given in Table 6.10; tilt and decenter parameters are given
in Table 6.15. See the discussion following Eq. (6.3.18). Axes: field angle (arc-min) and angular
astigmatism (arc-sec).

We also see in Fig. 6.8 that the astigmatic blur circles midway between the line
images are displaced by differing amounts from the curved focal surface. This
suggests that the proper curved focal surface for the misaligned case is both
shifted and tilted relative to the focal surface for aligned mirrors. If the vertex of
the aligned focal surface is offset by 45.5 mm (2.71 arc-min plus $\psi'$ from Eq.
(6.3.6) over the telescope scale of 0.0955 arc-min/mm), displaced by 0.18 mm in
the −z direction, and tilted about the x-axis by 0.79°, the through-focus spot
patterns are then those shown in Fig. 6.9. The symmetry of the blur circles on
opposite sides of $\theta_y = 2.71$ arc-min is now apparent.

We now take Eq. (6.3.19) and rewrite it as

$$\mathrm{AAS\ (arc\text{-}sec)} = -3.16\mathrm{E}{-3}\,[\theta_x^2 + (\theta_y - 2.71)^2 - 7.07]. \tag{6.3.20}$$

From Eq. (6.3.20) we see that AAS can be expressed in terms of $\theta'^2 = \theta_x^2 + \theta_y'^2$,
where $\theta_y' = \theta_y - 2.71$, and a constant term. When written in this form we see that
the center of symmetry of the astigmatic patterns along the y-axis has shifted to
$\theta_y' = 0$. Thus, for example, sagittal images far from the symmetry point will fall
on lines through the symmetry point. Examples of this are shown in Fig. 6.9.

It is instructive to evaluate Eqs. (6.3.15) and (6.3.18) for the RC$_3$ telescope
with its f/1.25 primary in Table 6.12. For a decenter of 3 mm the tilt required to
give zero coma is 14.0 arc-min and, in Eq. (6.3.20), the shifted angle is
$\theta_y' = \theta_y - 9.84$ arc-min, the constant in square brackets is 96.8, and the
multiplier is −4.96E-3. Astigmatism due to misalignment for this telescope is
substantially larger.

It is clear from our discussion that correcting coma in an aplanatic telescope by
a tilt and decenter combination satisfying Eq. (6.3.15) does affect the astigmatism,
with the principal effect a decentered and tilted astigmatic focal surface. It is
also evident from Fig. 6.7 that the effect of linear astigmatism leads to differences
that are largest at the edge of the usable field. McLeod (1996) describes how
observations of the astigmatic images at the edge of the field can be used to find
the tilt and decenter. It is then a straightforward procedure to reduce the values of
$\alpha$ and $l$ while maintaining zero coma. These effects on astigmatism due to
misalignment have also been discussed by Wilson and Delabre (1997) in
connection with the ESO “New Technology Telescope” (NTT).

Fig. 6.8. Through-focus spot diagrams for misaligned Ritchey-Chretien telescope. Scale bar on
upper left is 1 arc-sec long. See the caption of Fig. 6.7 for source of the parameters.

Fig. 6.9. Through-focus spot diagrams for misaligned Ritchey-Chretien telescope with shifted and
tilted focal surface. Scale bar on upper left is 1 arc-sec long. See the discussion following Eq. (6.3.19).


6.3.e. DESPACE ERROR 


We now turn our attention to the aberrations that appear when the error in 
placement of the secondary is one of despace. If the secondary is not at its 
nominal design position, then spherical aberration and coma are introduced and 
images at all points in the image field are degraded. Although spherical aberration 
is larger than coma, results are given for both aberrations. Astigmatism is also 
introduced but its size is negligible by comparison. 

The starting point for the calculation of spherical aberration resulting from
despace is $B_{3s}$ from Table 6.5 and ASA from Table 6.7, with the terms in square
brackets in $B_{3s}$ substituted into the relation for ASA. If $m$ is expressed in terms of
$k$ and $\rho$, the only variable parameter remaining is $k$. The position of the secondary
relative to the focal point of the primary is given by $s_2 = -kf_1$, hence a change in
$k$ means an axial shift of the secondary.

Taking the derivative of ASA with respect to $k$, and resubstituting for $k$ and $\rho$
in terms of $m$, we find

$$d(\mathrm{ASA}) = \frac{dk}{16F^3}\left\{m(m^2-1) - (m-1)^3\left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right]\right\}. \tag{6.3.21}$$

If $dk$ is the change in $k$ starting from the position of the secondary where the
spherical aberration is corrected, then $d$(ASA) is the angular spherical aberration
resulting from the despace, or simply ASA.

It is now a simple matter to evaluate this relation for different types of
telescopes and determine the sensitivity to despace. Using the relations for $K_2$ in
Eqs. (6.2.2) and (6.2.4) for classical and aplanatic telescopes, respectively, the
results are

$$\mathrm{ASA\ (classical)} = \frac{m(m^2-1)}{16F^3}\,\frac{ds_2}{f_1}, \tag{6.3.22a}$$

$$\mathrm{ASA\ (aplanatic)} = \frac{m(m^2-1)}{16F^3}\left[1 + \frac{2}{(m-1)(m-\beta)}\right]\frac{ds_2}{f_1}. \tag{6.3.22b}$$

A comparison of the relations in Eqs. (6.3.22) shows that aplanatic telescopes are
somewhat more sensitive to despace error than are the classical type, though only
by 10-15% for typical parameter values such as in Table 6.10. Comparing ASA
for the aplanatic telescopes in Table 6.16 shows that the Ritchey-Chretien is more
sensitive by a few percent to error in secondary position.
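The despace sensitivity can be evaluated directly. The sketch below codes the classical result $m(m^2-1)\,dk/16F^3$ and multiplies it by $1 + 2/[(m-1)(m-\beta)]$ for the aplanatic case; the Cassegrain values $m = 4$, $\beta = 0.25$, $F = 10$ are assumptions standing in for Table 6.10, and $dk = 0.001$ follows the footnote of Table 6.16.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-second


def asa_despace(dk, m, F, beta=None):
    """Angular spherical aberration (radians) for a despace dk = ds2/f1.

    beta=None selects the classical telescope; a numeric beta selects
    the aplanatic telescope with that normalized back focal distance.
    """
    asa = m * (m**2 - 1.0) / (16.0 * F**3) * dk
    if beta is not None:
        asa *= 1.0 + 2.0 / ((m - 1.0) * (m - beta))
    return asa


# Assumed parameters (stand-ins for Table 6.10).
asa_cc = asa_despace(0.001, m=4.0, F=10.0) / ARCSEC
asa_rc = asa_despace(0.001, m=4.0, F=10.0, beta=0.25) / ARCSEC
print(f"classical {asa_cc:.3f} arc-sec, Ritchey-Chretien {asa_rc:.3f} arc-sec")
```

Because $F = mF_1$, the leading factor is close to $1/16F_1^3$ for large $m$, which is the cube dependence on primary focal ratio noted in the text.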

A final thing to note about Eqs. (6.3.22a,b) is that, to a good approximation,
ASA is inversely proportional to the cube of primary mirror focal ratio. Hence a
telescope with a “faster” primary is more sensitive to despace error. A similar
conclusion was already noted here for secondary misalignments. In general, a
telescope with a faster primary is more sensitive to alignment errors of any kind.

Table 6.16

Angular Aberrations for Despaced Secondary (a, b)

          RC       AG
ASA      0.912    0.846
ATC      0.252    0.174

(a) dk = ds_2/f_1 = 0.001.
(b) Aberrations are given in units of arc-seconds,
    ATC is given at a field angle of 18 arc-min, and
    parameters of telescopes are given in Table 6.10.

The calculation of coma introduced by a despaced secondary proceeds in a
similar way. We start with the coma coefficient $B_{2s}$ in Table 6.6 and ATC from
Table 6.7, express all variables in terms of $k$ and $\rho$, and differentiate with respect
to $k$. The result for the aplanatic telescope is

$$\mathrm{ATC} = \frac{3\theta}{16F^2}\,(m+1)\left[\frac{2(m-1)(m+1-\beta) + 3}{1+\beta} - \frac{1}{m-\beta}\right]\frac{ds_2}{f_1}. \tag{6.3.23}$$

Corresponding results for other two-mirror telescopes are of little importance
because the coma already present in the off-axis images is dominant over that
introduced by despace.

A comparison of the relative sizes of ASA and ATC for aplanatic telescopes
with despaced secondary is given in Table 6.16, with the parameters of the
telescopes taken from Table 6.10.





6.4. THREE-MIRROR TELESCOPES 


With the addition of a third mirror to a reflecting system there are additional 
degrees of freedom to minimize or eliminate aberrations. It is possible, for 
example, to design systems free of third-order spherical aberration, coma, and 
astigmatism with flat image surfaces. Such three-mirror flat-field anastigmats can 
be found in a variety of practical configurations, unlike the case for two mirrors 
where there is only one possible configuration. 

The general analysis of a three-mirror system in terms of aberration coefficients
is considerably more complicated than that of a two-mirror system.
Because of this complexity, we will only outline the procedure and apply it to
some relatively simple examples, those in which there is collimated light between
the secondary and tertiary mirrors. Such three-mirror systems are generally
referred to as Paul-Baker telescopes. For a discussion of the general approach
to the design of three-mirror telescopes the reader should consult the article by
Robb (1978). An especially thorough discussion of the many possible three-mirror
configurations is given by Korsch (1991). The interested reader should
also consult the text by Wilson (1996).


6.4.a. GENERAL FORMULATION


Setting up the general relations that describe a three-mirror system is a 
straightforward extension of results given in Chapter 5. The system aberration 
coefficients, referenced to the primary, are 


$$B_{js} = B_{j1} + B_{j2}\left(\frac{y_2}{y_1}\right)^{j+1} + B_{j3}\left(\frac{y_3}{y_1}\right)^{j+1}, \qquad j = 0, 1, 2, 3, \tag{6.4.1}$$

where the subscripts 1, 2, and 3 refer to the primary, secondary, and tertiary
mirrors, respectively. Note that Eq. (6.4.1) is simply an extension of Eq. (5.6.11).
The other relation of interest is that for the Petzval curvature, which, from Table
5.7, is given by

$$\kappa_p = 2\left(\frac{1}{R_1} - \frac{1}{R_2} + \frac{1}{R_3}\right). \tag{6.4.2}$$

The general procedure is now one of selecting the system configuration, mirror 
separations and radii of curvature, and adjusting the conic constants to eliminate 
third-order aberrations. This procedure is best carried out with optimization 
routines available in ray-tracing software. We will not pursue this general 
approach but instead discuss some special cases. In our analysis we will make 
use of some of the special properties of afocal two-mirror telescopes. 


6.4.b. EXAMPLE: PAUL-BAKER TYPE 


The starting point for the Paul-Baker telescope, hereafter denoted PB, is a
Cassegrain afocal telescope of the type discussed here in Section 2.f. The mirror
pair consists of two paraboloids, a concave primary and a convex secondary,
whose focal points coincide. This combination, shown in Fig. 6.10, converts an
input beam of diameter $D$ into a collimated output beam of diameter $kD$, where
$k = y_2/y_1 = f_2/f_1 = R_2/R_1$. As shown in Section 2.f of this chapter this afocal
reducer has zero spherical aberration, coma, and astigmatism.
Fig. 6.10. Afocal beam reducer. Ratio of beam diameters = $k = f_2/f_1$.


We now add to the afocal reducer a concave spherical tertiary mirror whose
center of curvature is at the vertex of the secondary mirror, as shown in Fig. 6.11.
Note that the placement of the tertiary is similar to that of the spherical mirror in a
Schmidt camera. Because we have added a spherical mirror in collimated light the
system now has spherical aberration and, to compensate, the paraboloidal
secondary is replaced by a sphere, which introduces spherical aberration of
opposite sign. If $R_2 = R_3$, these two contributions of spherical aberration are
equal in absolute magnitude and the system is again free of spherical aberration.
It was first noted by Paul in 1935 that this system is also free of third-order coma
and astigmatism with a focal surface whose curvature $\kappa = 2/R_1$.

The fact that this three-mirror system is free of third-order aberrations can be
shown in two ways. The first way is to evaluate Eq. (6.4.1) for $j = 1, 2, 3$ and
show directly that $B_{1s} = 0$, $B_{2s} = 0$. Substitutions needed in the $j = 1, 2$
coefficients for the tertiary are

Fig. 6.11. Paul-Baker three-mirror telescope with focal length $f = f_3/k$.





6 W, R, [1—k 
h= m=0 Fare B(G*). 643 


This method is straightforward but tedious and does little to show the elegance of 
the original Paul system and its variants. 

A more insightful approach is to make use of the stop-shift relations and draw 
on our discussion of the Schmidt camera in Chapter 4. In our analysis of the 
afocal telescope given here in Section 2.f we pointed out that the third-order 
aberrations, though derived for the stop at the primary, are zero for any position 
of the aperture stop, a result that follows from the stop-shift relations in Chapter 
5. Therefore let us place the aperture stop at the secondary. We now have a system 
in which the tertiary spherical mirror is illuminated with collimated light from a 
stop at its center of curvature, a combination optically similar to a Schmidt 
camera. 

As noted in Chapter 4, a spherical mirror plus aperture stop at the center of 
curvature has no preferred axis and is free of coma and astigmatism. The 
spherical aberration of the mirror is eliminated by an aspheric plate located at 
the stop; the wavefront advance at the mirror is compensated by an equal 
wavefront retardation from the aspheric plate. Because this plate is located at 
the stop, it does not introduce coma or astigmatism. 

Returning now to the PB system, the wavefront advance at the spherical
tertiary mirror is compensated by an equal wavefront retardation introduced by
changing the paraboloidal secondary to a sphere. But this change of the
secondary is entirely equivalent to introducing an aspheric plate, as seen by
comparing the $r^4$ terms in Eq. (5.1.1). Hence no coma or astigmatism is
introduced into the PB design by changing the conic constant of the secondary.

This conclusion is especially important because it means that the original Paul
design with $R_3 = R_2$ can be generalized to systems with $R_3 \neq R_2$, provided two
conditions are met: (1) the center of curvature of the spherical tertiary is located at
the vertex of the secondary; and (2) the conic constant of the secondary is chosen
to give zero spherical aberration for the complete system.

The family of variants of the original Paul design is easily found by starting 
with the relation for spherical aberration obtained from Eq. (6.4.1). Substituting 
the spherical aberration coefficients for the secondary and tertiary mirrors from 
Table 5.6 into Eq. (6.4.1) gives 


1 K+l b 
4 2 
Bs, = (Fe aR3 -3) (6.4.4) 
where $k = y_2/y_1 = y_3/y_1$. The choice of parameters that makes $B_{3s} = 0$ includes


2|/R\* 
K, =0, b= (2) -1 |, (6.4.5a) 
2 Al R, l 


$$b = 0, \qquad K_2 = -1 + \left(\frac{R_2}{R_3}\right)^3. \tag{6.4.5b}$$


These choices are, of course, equivalent through terms in $r^4$ in Eq. (5.1.1), as
substitution of Eqs. (6.4.5a) and (6.4.5b) in turn demonstrates. The first of these 
combinations can be described as an aspheric figure on a spherical mirror, the 
other as an ellipsoidal mirror. With either combination it is a straightforward 
exercise to verify directly that coma and astigmatism are zero, but with our use of 
the stop-shift relations it is not necessary to do so. For fast systems the solutions 
given by Eqs. (6.4.5a,b) must be supplemented by aspheric terms of higher order 
on the mirrors to control fifth and higher-order aberrations. 

We now take the general Paul design and add the condition for zero Petzval
curvature, an analysis first done by Baker (1969). Writing Eq. (6.4.2) in terms of
$k$ gives

$$\kappa_p = \frac{2}{R_1}\left[1 - \frac{1}{k}\left(1 - \frac{R_2}{R_3}\right)\right]. \tag{6.4.6}$$

Setting $\kappa_p = 0$ gives $R_2/R_3 = 1 - k$.
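The two design conditions, zero spherical aberration with an unfigured ($b = 0$) secondary and zero Petzval curvature, can be checked with a short calculation. The sketch below takes $K_2 = -1 + (R_2/R_3)^3$ and the Petzval sum $2(1/R_1 - 1/R_2 + 1/R_3)$, and applies the flat-field condition $R_2/R_3 = 1 - k$. The function names are our own; $R_1$ = −11025 mm and $k$ = 0.28 are those of the representative design in Table 6.18.

```python
def secondary_conic(R2, R3):
    """Secondary conic constant giving zero spherical aberration (b = 0)."""
    return -1.0 + (R2 / R3) ** 3


def petzval_curvature(R1, R2, R3):
    """Petzval curvature of the three-mirror system."""
    return 2.0 * (1.0 / R1 - 1.0 / R2 + 1.0 / R3)


# Representative flat-field design (values from Table 6.18), lengths in mm.
k = 0.28
R1 = -11025.0
R2 = k * R1          # afocal pair: k = R2/R1
R3 = R2 / (1.0 - k)  # flat-field condition R2/R3 = 1 - k

print(f"R2 = {R2:.1f} mm, R3 = {R3:.1f} mm")
print(f"K2 = {secondary_conic(R2, R3):.5f}")
print(f"Petzval curvature = {petzval_curvature(R1, R2, R3):.2e} per mm")
```

For the original Paul design with $R_2 = R_3$ the same function returns $K_2 = 0$, a spherical secondary.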

Additional relations for Paul-Baker telescopes are given in Table 6.17. Note in
particular that the choice of the primary mirror focal ratio and any two from
among the mirror separation ratio, obscuration ratio $k$, or $R_3/R_2$ set the basic
parameters of the telescope. Once these parameters are selected and a field size is
chosen, the diameters of the secondary and tertiary mirrors and the effective
obscuration ratio can be computed.


Table 6.17

General Relations for Paul-Baker Telescopes (a)

Mirror separation ratio:       $\dfrac{2k}{1-k}\,\dfrac{R_3}{R_2}$
Focal ratios and lengths:      $F/F_1 = f/f_1 = R_3/R_2$
Diameter of secondary:         $D_2 = D[k + 2F_1\theta(1 - k)]$
Diameter of tertiary:          $D_3 = D[k + 2F_1\theta(1 - k) + 4F\theta]$
Effective obscuration ratio:   $k'D = D_3 + 4kF\theta D$

(a) For flat-field Paul-Baker telescopes use $R_3/R_2 = 1/(1-k)$.
The first telescope of the PB type built is the CCD/transit instrument
described by McGraw et al. (1982), and located at the Steward Observatory of
the University of Arizona. This telescope has a 1.8-m f/2.2 primary with
$k \approx 0.32$, and near-diffraction-limited images over a 1° field of view. Because
of the baffles required to prevent light reflected directly from the primary or
tertiary from reaching the detector, the vignetting by the central obscuration is
approximately 22%. For further details, the reader should consult the article by
McGraw et al. (1982).

There are several examples of PB designs in the literature. One example is a
PB telescope with an f/1 primary and f/2 final focal ratio, and excellent image
quality over a 1° diameter field, by Angel, Woolf, and Epps (1982). In their
design, aspheric terms are put on the secondary and tertiary mirrors. The nominal
parameters of the design are $k = 0.168$ and $R_3/R_2 = 2$, hence the mirror
separation ratio from the relation in Table 6.17 is 0.808.

Willstrop (1984) has published designs of PB telescopes with curved focal
surfaces and one with a flat focal plane. One particular design with a curved focal
surface has a field of view of 4° diameter, with f/1.6 for both the primary mirror
and overall telescope. He chooses to place the focal surface at the vertex of the
primary mirror, thus the mirror separation ratio = 2 and $k = 0.5$. By allowing
higher-order aspherics on each of the three mirrors, Willstrop is able to achieve
image diameters under 0.31 arc-sec over a 4° field. For further details the
interested reader should study the articles by Willstrop.
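The mirror separation ratios quoted for these two designs follow from a one-line relation. The sketch below assumes the ratio is the secondary-to-tertiary separation, $R_3$, divided by the primary-to-secondary separation, $f_1(1-k)$, which reduces to $[2k/(1-k)](R_3/R_2)$; this reading is an assumption, adopted because it reproduces both quoted values.

```python
def separation_ratio(k, r3_over_r2):
    """Mirror separation ratio of a Paul-Baker telescope.

    Assumes (secondary-tertiary separation) / (primary-secondary
    separation), with the tertiary's center of curvature at the
    secondary vertex and an afocal primary-secondary pair.
    """
    return 2.0 * k * r3_over_r2 / (1.0 - k)


# Angel, Woolf, and Epps design: k = 0.168, R3/R2 = 2.
print(f"{separation_ratio(0.168, 2.0):.3f}")

# Willstrop design: k = 0.5 with f/1.6 for primary and telescope, so R3/R2 = 1.
print(f"{separation_ratio(0.5, 1.0):.3f}")
```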

Because of the excellent image quality achievable with the PB system, a more
detailed look into the characteristics of a representative PB is in order. We choose
a flat-field version with an f/1.575 primary and $k = 0.28$ covering a field of 1.6°
diameter, as shown in Fig. 6.12, with surface parameters given in Table 6.18.
With the addition of aspheric terms in $y^6$ and $y^8$ to the secondary and tertiary
mirrors, and the application of an optimization algorithm in a raytrace package to
these terms, the diameters of images range from about 0.25 arc-sec diameter near
the center of the field to 0.5 arc-sec at the edge. No attempt was made to further
improve the image quality by letting other parameters vary, although improvement
is expected. If aspheric terms are added to all of the mirrors, then image
diameters are 0.1 arc-sec or less over the entire flat field.

Given the excellent image quality that can be obtained with the Paul-Baker 
design, it is legitimate to wonder why more telescopes with this configuration 
have not been built. Factors that could be considered shortcomings for this type of 
telescope are: (1) limited volume available for instrumentation behind the focal 
surface; (2) relatively large vignetting because of baffles required to shield 
the tertiary mirror from extraneous light; (3) effect of additional optics, such as 
a filter or atmospheric dispersion corrector, on image quality; and (4) constraint 
on the overall focal length and focal ratio. We will discuss each of these 
briefly. 
Fig. 6.12. Flat-field Paul-Baker telescope with f/2.19 overall and field diameter of 1.6°. See
Table 6.18 for the mirror parameters.


With the focal surface located midway between the secondary and tertiary 
mirrors, much of the volume behind the focal surface is taken up by light 
traveling to and from the secondary. Although there is enough space for detectors 
used for direct imagery, the same cannot be said for slit spectrographs. Using a 
telescope such as this for spectroscopy would require optical fibers feeding a 
bench spectrograph. 

The focal surface is in the focal plane of the tertiary, hence this mirror must be
shielded from starlight from sources outside of the nominal field-of-view (FOV).
At a minimum this means a circular obstacle behind the secondary of the size
shown in Fig. 6.12. The size of this obstacle is the diameter of the tertiary needed
to accept all light from the primary within the FOV plus an annulus that excludes
light from within the FOV from entering the telescope aperture and going directly
to the tertiary. This diameter is given in Table 6.17 as $k'D$, along with the
diameters of the secondary and tertiary mirrors. For $k = 0.28$, $F = 2.1875$, and
$\theta = 0.8°$, the computed value of $k' = 0.468$ and, at a minimum, the fraction of
the incident light lost before reaching the primary is about 0.22.

Table 6.18

Parameters for Flat-Field Paul-Baker Telescope (a)

Surface       R (mm)       K            Separation (mm)
Primary      −11025       −1
                                         −3939
Secondary     −3087       −0.62675
                                          4287.5
Tertiary      −4287.5       0

(a) D = 3.5 m, k = 0.28, F_1 = 1.575, F = 2.1875.
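The obscuration estimate can be reproduced from the relations in Table 6.17. The sketch below evaluates $D_2 = D[k + 2F_1\theta(1-k)]$, $D_3 = D_2 + 4F\theta D$, and $k'D = D_3 + 4kF\theta D$ for the representative design, using $D$ = 3.5 m and $F_1$ = 1.575 from the footnote of Table 6.18; the function name is our own.

```python
import math


def pb_diameters(D, k, F1, F, theta):
    """Secondary diameter, tertiary diameter, and effective obscuration
    ratio k' for a Paul-Baker telescope; theta is the field radius in
    radians and diameters are returned in the units of D."""
    D2 = D * (k + 2.0 * F1 * theta * (1.0 - k))
    D3 = D2 + 4.0 * F * theta * D
    k_eff = D3 / D + 4.0 * k * F * theta
    return D2, D3, k_eff


D2, D3, k_eff = pb_diameters(D=3.5, k=0.28, F1=1.575, F=2.1875,
                             theta=math.radians(0.8))
print(f"D2 = {D2:.3f} m, D3 = {D3:.3f} m")
print(f"k' = {k_eff:.3f}, fraction of light lost = {k_eff**2:.2f}")
```

The fraction lost is the area ratio $k'^2$, since the obstacle and the primary are both circular.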

There are two possible locations for additional optical elements, such as filters 
or an atmospheric dispersion corrector (ADC), one in the collimated light beam 
in the plane of the focal surface, the other in the convergent beam just before the 
focal plane. In the former location the beam diameter is about 1.2 m for a 3.5-m 
telescope. Making a filter or ADC of this size and holding it in position would be 
a formidable task. Such an element would also give additional vignetting of the 
converging beam from the primary and increase the fraction lost to about 0.31. 

Thus any additional optics would likely be placed in the converging beam near 
the focal plane. A plane-parallel plate in this beam will shift the focus by an 
amount given by Eq. (2.4.5) and introduce both coma and spherical aberration. 
Using the aberration coefficients for such a plate from Eqs. (7.2.11) and (7.2.12), 
we find that the transverse aberrations of a glass plate of index n and thickness t 
are 








\[ \mathrm{TSA} = \frac{n^2-1}{n^3}\,\frac{t}{16F^3}, \qquad \mathrm{TTC} = 3\theta\,\frac{n^2-1}{n^3}\,\frac{t}{8F^2}. \tag{6.4.7} \]


The blur introduced by a 6-mm plate of BK7 glass in our representative PB 
telescope is of the order of 0.2 arc-sec. Hence the monochromatic image quality 
at best focus is only slightly degraded. 
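As a rough numerical check of the blur quoted above, the following Python sketch evaluates the third-order transverse spherical aberration of a plane-parallel plate, assuming the standard form TSA = (n² − 1)t/(16n³F³) and taking the blur at best focus as half of TSA. The index n = 1.52 for BK7 and all function names are illustrative assumptions.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def plate_tsa_mm(n, t_mm, focal_ratio):
    """Third-order transverse spherical aberration (mm) of a plane-parallel
    plate of index n and thickness t_mm in a beam of the given focal ratio."""
    return (n**2 - 1.0) / n**3 * t_mm / (16.0 * focal_ratio**3)

def to_arcsec(length_mm, aperture_mm, focal_ratio):
    """Project a length in the focal plane onto the sky (arc-sec)."""
    focal_length_mm = aperture_mm * focal_ratio
    return length_mm / focal_length_mm * ARCSEC_PER_RAD

# Representative Paul-Baker telescope from the text: D = 3.5 m, F = 2.1875.
D_MM = 3500.0
F = 2.1875

tsa_mm = plate_tsa_mm(n=1.52, t_mm=6.0, focal_ratio=F)  # assumed BK7 index
blur_best_focus = to_arcsec(tsa_mm / 2.0, D_MM, F)      # assumed: TSA/2 at best focus

print(f"TSA = {tsa_mm * 1e3:.1f} um, best-focus blur = {blur_best_focus:.2f} arc-sec")
```

Under these assumptions the blur comes out near 0.2 arc-sec, in line with the order of magnitude stated in the text.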

A more serious consequence of introducing a plate into the converging beam 
is longitudinal chromatic aberration or LCA, the change in focus with wavelength. 
We can determine the approximate LCA by computing how Δ in Eq. 
(2.4.5) changes with wavelength. The result for a plate of thickness t is 

\[ \frac{d\Delta}{d\lambda} = \frac{d\Delta}{dn}\,\frac{dn}{d\lambda} = \frac{t}{n^2}\,\frac{dn}{d\lambda}, \tag{6.4.8} \]

hence 

\[ \mathrm{LCA} = \delta\Delta = \frac{t}{\langle n \rangle^2}\,\delta n, \tag{6.4.9} \]

where δΔ is the change in focus for an index difference δn and average index ⟨n⟩. 
The blur size is a minimum at the midpoint between the extreme focus positions, 
with the blur diameter = δΔ/2F. For a 6-mm BK7 plate transmitting from 400 to 
700 nm, we find δn = 0.018, ⟨n⟩ = 1.52, LCA = 47 μm, and a blur diameter 
projected on the sky of approximately 0.3 arc-sec. The combined effects of TSA 
and LCA over this wavelength range give an effective blur of about 0.4 arc-sec. 
Willstrop (1984) has designed an ADC for an f/1.6 beam and points out the 
substantial amount of LCA that is an inevitable part of such a system. 
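The LCA figures quoted above can be reproduced with a short Python sketch. It assumes the relation LCA = t δn/⟨n⟩² and a blur diameter of LCA/2F at the midpoint focus; the function names are illustrative.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def plate_lca_mm(t_mm, delta_n, n_mean):
    """Change in focus shift of a plate over an index range delta_n."""
    return t_mm * delta_n / n_mean**2

def lca_blur_arcsec(lca_mm, focal_ratio, aperture_mm):
    """Blur diameter at the midpoint between the extreme foci, on the sky."""
    blur_mm = lca_mm / (2.0 * focal_ratio)
    return blur_mm / (aperture_mm * focal_ratio) * ARCSEC_PER_RAD

# Values from the text: 6-mm BK7 plate, 400 to 700 nm, D = 3.5 m, F = 2.1875.
lca_mm = plate_lca_mm(t_mm=6.0, delta_n=0.018, n_mean=1.52)
blur = lca_blur_arcsec(lca_mm, focal_ratio=2.1875, aperture_mm=3500.0)

print(f"LCA = {lca_mm * 1e3:.0f} um, blur = {blur:.2f} arc-sec")
```

The result is an LCA of about 47 μm and a blur of roughly 0.3 arc-sec, matching the values in the text.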
The final limitation of the Paul-Baker telescope, at least in comparison with a 
two-mirror telescope, is the lack of freedom in choosing the final focal ratio. 
From Table 6.17 we see that F is tied to F_1 by the choice of mirror radii. We also 
see that for the same field size the effective obscuration ratio k′ increases as F and 
F_1 increase, hence all proposed designs are based on a fast primary mirror with a 
final focal ratio that is significantly smaller than that for a typical Cassegrain. 

Thus the Paul-Baker design, in spite of its excellent image quality over a field 
significantly larger than that of a Ritchey-Chretien telescope, has not been the 
choice for large telescope systems. 


6.4.c. OTHER THREE-MIRROR TELESCOPES 


If the constraint of collimated light between the secondary and tertiary mirrors 
is removed, then many three-mirror telescope designs with excellent image 
quality are possible. In this section we present only a few such designs to 
illustrate some of these possibilities. 

A design by Korsch (1972) has a slowly converging beam between the 
secondary and tertiary mirrors and a flat focal surface just outside of the space 
between the mirrors. The layout is shown in Fig. 6.13 for an f/3 primary, f/4.5 
overall, and a field diameter of 1.2°, with the parameters given in Table 6.19. 
Note that each of the mirrors is hyperbolic in cross section. Image quality is 
excellent with 0.1 arc-sec diameters over the flat field. Although the focal surface 
is now easily accessible, the price paid is relatively large obscuration by the 
tertiary of the converging beam from the primary, with the fraction of light lost at 
about 0.35. 

Another flat-field design by Korsch (1977) is shown in Fig. 6.14 with the 
system parameters scaled to a 3.5-m telescope given in Table 6.20. Image sizes 
over a 1.5° diameter field are at the 0.1 arc-sec level or less. This f/12 design 


Fig. 6.13. Flat-field Korsch telescope with f/4.5 overall and external focus of diameter 1.2°. See 
Table 6.19 for the mirror parameters. 

Table 6.19

Parameters for Korsch Flat-Field Telescope^a

Surface      R (mm)      K           Separation (mm)
Primary      −21000      −1.26294
                                     −7875
Secondary    −5833.3     −2.84322
                                     3365.4
Tertiary     −8076.9     −1.40148
                                     −3432.7
Image        ∞

^a D = 3.5 m, F_1 = 3.0, F = 4.5.


features a fold mirror located at the exit pupil in the space between the primary 
and tertiary mirrors, thus giving a large accessible focal plane. Because of the 
fold mirror the center of the field is totally vignetted and a portion around the 
center is partially vignetted. Unlike the other three-mirror telescopes discussed, 
this design has a relatively large final focal ratio. Korsch points out that an 
advantage of this type of configuration is a focal surface free from stray light 
without an extensive system of baffles. This advantage is typical of what is called 
a 2-axis configuration. 

The final example presented in this section is a design by Robb (1978), a flat- 
field f/5 system with the focal plane located near the vertex of the primary, as 
shown in Fig. 6.15. Image diameters over a field spanning 2° are 0.2 arc-sec or 
smaller. The parameters given by Robb show that each of the mirrors is 
hyperbolic in cross section, with additional aspheric terms added to the primary 
and secondary. For the field size shown in Fig. 6.15 we see that the vignetting by 
the focal plane of the beam heading toward the tertiary is quite substantial. 





Fig. 6.14. Flat-field 2-axis Korsch telescope with f/12 overall and external focus of diameter 1.5°. 
See Table 6.20 for the mirror parameters. 

Table 6.20

Parameters for Korsch 2-Axis Flat-Field Telescope^a

Surface      R (mm)       K            Separation (mm)
Primary      −15400       −0.969825
                                       −6483.4
Secondary    −2962.16     −1.739743
                                       10481.3
Tertiary     −3620.02     −0.558565
                                       −3573
Image        ∞

^a D = 3.5 m, F_1 = 2.2, F = 12.


Fig. 6.15. Flat-field Robb telescope with f/5 overall and field diameter of 2°. 


6.5. FOUR-MIRROR TELESCOPES 


As astronomers push for telescopes larger than 10-m diameter, it is likely that 
conventional designs suitable for telescopes in the 4- to 8-m class will no longer 
be appropriate for what we will call giant telescopes. The principal reason for this 
is the expected change from monolithic primary mirrors, quite satisfactory for 8- 
m class telescopes, to segmented primaries such as in the 10-m Keck telescopes. 

Segmented mirrors can, in principle, be made for any aspheric shape, but it 
seems likely that giant segmented primary mirrors will be spherical. Although the 
Keck mirrors are parabolic in cross section, and do the job quite well, polishing 
off-axis aspheric segments to the required accuracy is a nontrivial and costly task. 
The advantages of spherical segments include ease of polishing to the required 
accuracy and complete interchangeability of segments within the primary mirror 
array. These advantages, in turn, translate into lower cost. 

Among the telescope systems considered so far in this chapter, the two-mirror 
telescope with spherical primary and zero overall spherical aberration has an 
unacceptably small field because of large coma. For two-mirror telescopes, in 
general, two aberrations can be corrected with the proper choices of the conic 
constants on the mirrors, leading to the aplanatic designs with zero spherical 
aberration and coma discussed in Section 6.2. If, however, the primary mirror is 
spherical, then the remaining conic constant can, in general, be set to make only 
one aberration zero, specifically spherical aberration. Although two mirrors can 
be configured to correct for more than two aberrations, as in the anastigmat or 
flat-field aplanat discussed in Section 6.2, such designs are constrained to 
particular combinations of normalized parameters and are of limited usefulness. 

With three mirrors it is possible, in general, to correct for three aberrations 
with the proper choices for the conic constants. For the three-mirror telescope 
designs based on the Paul-Schmidt concept the choices are a paraboloidal 
primary and spherical tertiary with the conic constant of the secondary tailored 
to the layout of the mirrors. If, in addition, a flat field is required, then the system 
parameters are again constrained to certain combinations. Other three-mirror 
designs are similarly limited to particular combinations of conic constants and 
system parameters when more than three aberrations are corrected. As with two- 
mirror telescopes, requiring a spherical primary in a three-mirror system removes 
one variable from the parameters available for correction of aberrations and leads 
to no practical designs. 

It is for these reasons that we consider, at least briefly, some of the possibilities 
with four mirrors. In our analysis we will follow the excellent discussion on four- 
mirror telescopes by Wilson (1996), with an emphasis on the principles leading to 
practical designs of such systems with a spherical primary mirror. We will also 
consider only 2-axis designs, largely because of the problems with vignetting in 
single-axis systems. 


6.5.a. EXAMPLES 


As a starting point in an analysis leading to a practical four-mirror telescope 
we return to Table 6.13 and the aberration coefficients for afocal telescopes. The 
first three coefficients are zero when the mirrors are paraboloids, our starting 
point for the Paul-Baker designs. Consider instead choosing K_1 = 0 and 
K_2 = −1 for the afocal arrangement. From Table 6.13 we see that coma and 
astigmatism are still zero although, of course, spherical aberration is not zero. 
These results are expected because the coefficients were derived with the aperture 
stop at the primary mirror, and coma and astigmatism are independent of the 
conic constant when the pupil is at the surface. Thus the beam from this modified 
afocal system has zero coma and astigmatism, but large spherical aberration from 
mirror M_1. Given the nonzero spherical aberration, the exit pupil is fixed at a 
distance −k(1 − k)f_1 relative to mirror M_2. 
This afocal pair is now used as the feeder for a third mirror. Following the 
Paul-Schmidt concept, mirror M_3 is spherical and its center of curvature coincides 
with the center of the exit pupil of the afocal feeder. Mirror M_3, in turn, reimages 
the exit pupil on to the fourth mirror of the system, M_4, where the conic constant 
K_4 is chosen to zero the spherical aberration of the system. Because M_1 is the 
source of most of the spherical aberration, the large wavefront advance at the 
primary will be compensated by an approximately equal wavefront retardation at 
M_4. Note that M_4 will introduce some coma and astigmatism into the design, but 
adjusting K_4 will not introduce additional amounts of these off-axis aberrations. 

At this point we make the crucial observation that the mirror arrangement 
described in the preceding paragraph is possible only with a 2-axis system. 
Mirror M_3 reimages the exit pupil back on itself when the pupil is a distance R_3 
from the mirror, but the light is intercepted by M_2 and no light reaches the 
reimaged pupil. Hence the beam must be folded between mirrors M_2 and M_3, with 
the obvious choice that the fold mirror be located where the beam is smallest, 
near the focal point of M_3 where an intermediate image is formed. A layout of 
such a four-mirror configuration is shown in Fig. 6.16. 

Following Wilson we will choose a 16-m telescope with an f/1.5 primary and 
k = 0.25. We choose to set M_3 at a distance 1.25 times farther from the secondary 
than is the primary. With the position of M_3 established, its radius of curvature 
and the position of M_4 are easily found. The final parameter to be set is the 
position of the final focus from which the radius of curvature of M_4 can be found. 
For our example we choose the magnification of the intermediate image 
m_4 = −2. The nominal parameters for this telescope are given in Table 6.21 
and can serve as the starting point for an optimization analysis. The value of F_3 
follows from the focal ratio relation in Table 6.17, with the overall F = |m_4|F_3. 
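The chain of choices in the preceding paragraph can be traced numerically. The Python sketch below works with unsigned magnitudes only and assumes simple paraxial relations: an afocal primary-secondary pair, the center of curvature of the tertiary at the exit pupil a distance k(1 − k)f_1 behind the secondary, the fourth mirror at the reimaged pupil, and the mirror equation for the final image. The function and key names are illustrative.

```python
def four_mirror_nominal(D, F1, k, m3_factor, m4):
    """Nominal (unsigned) parameters of the four-mirror 2-axis layout."""
    f1 = F1 * D                      # primary focal length
    R1 = 2.0 * f1
    R2 = k * R1                      # afocal pair: f2 = k * f1
    d12 = (1.0 - k) * f1             # primary-secondary separation
    d23 = m3_factor * d12            # secondary-tertiary separation
    pupil = k * (1.0 - k) * f1       # exit pupil distance behind the secondary
    R3 = d23 + pupil                 # center of curvature of M3 at the exit pupil
    d34 = R3                         # M4 sits at the reimaged pupil
    F3 = (R3 / 2.0) / (k * D)        # collimated beam of diameter k*D onto M3
    s4 = d34 - R3 / 2.0              # intermediate image to M4
    s4_image = abs(m4) * s4          # M4 to final focus
    R4 = 2.0 * s4 * s4_image / (s4 + s4_image)
    return {"R1": R1, "R2": R2, "R3": R3, "R4": R4, "d12": d12, "d23": d23,
            "d34": d34, "d4_image": s4_image, "F3": F3, "F": abs(m4) * F3}

params = four_mirror_nominal(D=16.0, F1=1.5, k=0.25, m3_factor=1.25, m4=-2.0)
for name, value in params.items():
    print(f"{name:9s} {value:g}")
```

With D = 16 m, F_1 = 1.5, and k = 0.25 this gives radii of 48, 12, 27, and 18 m, separations of 18, 22.5, 27, and 27 m, F_3 = 3.375, and F = 6.75, the magnitudes listed in Table 6.21.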

The nominal value of K_4 in Table 6.21 is found by substituting the spherical 
aberration coefficients for M_1 and M_4 from Table 5.2 into Eq. (5.6.7) and setting 
the sum to zero. This gives 

\[ K_4 = \frac{1}{k^4}\left(\frac{R_4}{R_1}\right)^3 - \left(\frac{m_4+1}{m_4-1}\right)^2, \tag{6.5.1} \]

or about −13.6 for the parameters in Table 6.21. Thus mirror M_4 is strongly 
hyperbolic, as expected. 
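A quick evaluation of the nominal conic constant is sketched below in Python. It assumes the balance condition takes the form K_4 = (1/k⁴)(R_4/R_1)³ − [(m_4 + 1)/(m_4 − 1)]², with signed radii as in Table 6.21 and the small contribution of the spherical tertiary neglected; the function name is illustrative.

```python
def nominal_k4(k, R1, R4, m4):
    """Conic constant of M4 that cancels the spherical aberration of a
    spherical primary (assumed form; the tertiary's contribution is neglected)."""
    return (R4 / R1) ** 3 / k**4 - ((m4 + 1.0) / (m4 - 1.0)) ** 2

# Signed radii in meters from Table 6.21; k = 0.25, m4 = -2.
K4 = nominal_k4(k=0.25, R1=-48.0, R4=18.0, m4=-2.0)
print(f"K4 = {K4:.2f}")
```

The first term contributes −13.5 and the magnification term −0.11, so the sketch gives K_4 ≈ −13.6.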

The parameters allowed to vary in an optimization process include the conic 
constants and aspheric terms of sixth order and higher for M_3 and M_4. In order to 
work on the coma and astigmatism introduced by M_4, the locations of these 
mirrors relative to the secondary are also allowed to vary. The radii of curvature of 
all the mirrors and the shapes of M_1 and M_2 are generally held constant. We will 
not give the detailed results of our analysis, but simply note that the final mirror is 
Fig. 6.16. A four-mirror 2-axis telescope configuration based on the Paul-Schmidt principle. See 
Table 6.21 for nominal parameters and discussion following Eq. (6.5.1). 

Table 6.21

Nominal Parameters for Four-Mirror 2-Axis Telescope^a

Surface      R (m)    Separation (m)    K
Primary      −48                        0
                      −18
Secondary    −12                        −1
                      22.5
Tertiary     −27                        0
                      −27
Fourth       18                         −13.6
                      27
Image

^a D = 16 m, F_1 = 1.5, F_3 = 3.375, F = 6.75. Signs of 
radii and separations apply to unfolded 1-axis telescope. 
strongly hyperbolic and the tertiary is approximately parabolic. Image diameters 
are 0.2 arc-sec or less over a flat field of 18 arc-min diameter. 

Wilson and Delabre (1997) have analyzed this type of system in great detail 
and point out that the secondary M_2 can also be made spherical rather than 
parabolic. For their specific design with K_2 = 0, they find K_3 = −0.951 and 
K_4 = −11.12. They point out that the second axis can be positioned favorably to 
coincide with the altitude axis of an alt-az mounting, and that two identical 
“Nasmyth-type” foci are possible. The reader should consult their papers for 
further details. 

The design in Fig. 6.16 has collimated light between mirrors M_2 and M_3, and 
an intermediate focus and fold mirror between M_3 and M_4. Another possibility is 
a reversal of these roles, an intermediate focus and fold mirror between M_2 and 
M_3, and collimated light between M_3 and M_4. A layout of this type of 
configuration with an f/1.5 primary is shown in Fig. 6.17. Wilson and Delabre 
have also examined this type of system in detail and find that excellent image 
quality is achievable over a 30 arc-min diameter field, comparable to that for the 
configuration in Fig. 6.16. 


Fig. 6.17. A four-mirror 2-axis telescope configuration based on the Paul-Schmidt principle with 
intermediate focus between the secondary and tertiary mirrors. 
These two examples will suffice to introduce the reader to possibilities for 
practical four-mirror telescopes with spherical primaries. Wilson (1996) discusses 
other four-mirror configurations; the interested reader should consult his text for 
specifics. 


6.5.b. PUPIL ALIGNMENT 


In Section 6.3 we discussed the consequences of a misaligned secondary 
mirror in a two-mirror telescope and showed that an error in its position (tilt, 
decenter, and/or despace) introduced aberrations. In this section we consider the 
consequences of a misaligned pupil and show that aberrations are again the result. 
In the case of a four-mirror telescope with its entrance pupil at M_1, the pupil is 
misaligned if the optics between mirrors M_1 and M_4 do not properly image the 
entrance pupil on to M_4. Pupil misalignment is, of course, a consequence of the 
incorrect placement of the optics between M_1 and M_4, but our emphasis here is 
on the misalignment of the pupil, not on the error in location of preceding optics. 

We assume that the error at the entrance pupil is entirely fixed third-order 
spherical aberration (SA3) and not a more complicated type of wavefront error, 
either static or dynamic. (Correction of dynamic wavefront error is discussed 
under the headings of active and adaptive optics.) The assumption that the error is 
entirely SA3 is often true in practice. The best known case is that of the error in 
the Hubble Space Telescope (HST) primary where an undetected error in an 
optical test fixture was propagated into a surface error on the mirror. Numerous 
other primary mirrors have also had residual errors of this type that went 
undetected until put into operation in telescopes. In the case of HST, the “fix” 
was put on an optical element at a reimaged entrance pupil and careful attention 
was paid to possible pupil misalignment. A discussion of HST in the context of 
this section follows our general analysis. 

We begin by designating the entrance pupil by Σ_0 and the exit pupil by Σ. We 
also assume that the pupil imaging optics has negligible spherical aberration, 
hence the amount of SA3 wavefront error added to the shape of the optic at Σ is 
equal in magnitude but opposite in sign to that present at Σ_0. The two effects we 
consider quantitatively are pupil magnification and pupil shear, with a brief 
qualitative discussion of pupil aberration. Our analysis of these effects parallels a 
detailed discussion of this topic by Meinel and Meinel (1992). 

Let Ω and Ω′ denote the magnitudes of the wavefront errors added at the exit 
pupil Σ and reimaged by the intermediate optics, respectively. The general forms 
of these errors for SA3 are 

\[ \Omega = Ay^4, \qquad \Omega' = Ay'^4. \tag{6.5.2} \]

The difference between the wavefronts is the residual error ΔΩ. For an aligned 
pupil y′ = y and there is no residual error. 
Pupil magnification occurs when the reimaged entrance pupil has a different 
size than the wavefront error added at Σ. In this case y′ = (1 + ε)y and the error is 

\[ \Delta\Omega_m = Ay^4 - A(1+\varepsilon)^4 y^4 = -4A\varepsilon y^4 = -4\varepsilon\Omega, \tag{6.5.3} \]

plus terms in higher powers of ε that are negligible for ε ≪ 1. The residual error 
in this case is spherical aberration. Note that the size of the residual error is 
proportional to the error at the entrance pupil. In a four-mirror telescope this 
effect is controlled by proper spacing of mirrors M_3 and M_4 with their as-built 
radii of curvature. 

Pupil shear occurs when the reimaged entrance pupil is decentered on the exit 
pupil. In this case y′ = y + δy and the residual error for pupil shear is 

\[ \Delta\Omega_s = Ay^4 - A(y+\delta y)^4 = -4Ay^3\,\delta y = -4\Omega(\delta y/y), \tag{6.5.4} \]

plus terms in higher powers of δy that are negligible for δy ≪ y. The residual 
error in this case has the form of coma, which we designate by CM3. Note that 
larger spherical aberration at the entrance pupil requires a tighter tolerance on 
pupil centering for the same coma residual. Coma due to pupil shear is constant 
across the image field. 
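To see how good the first-order approximations are, the following Python sketch compares the exact residuals with the linearized forms −4εΩ and −4Ω(δy/y), assuming a pure fourth-power wavefront error Ω = Ay⁴. The normalization A = y = 1 and the function names are arbitrary illustrative choices.

```python
def residual_magnification(A, y, eps):
    """Exact residual when the reimaged pupil is scaled by (1 + eps)."""
    return A * y**4 - A * ((1.0 + eps) * y) ** 4

def residual_shear(A, y, dy):
    """Exact residual along the shear direction for a pupil decentered by dy."""
    return A * y**4 - A * (y + dy) ** 4

A, y = 1.0, 1.0            # arbitrary normalization
omega = A * y**4
eps = 1.0e-4               # fractional magnification error
dy = 1.0e-4 * y            # pupil shear

ratio_m = residual_magnification(A, y, eps) / (-4.0 * eps * omega)
ratio_s = residual_shear(A, y, dy) / (-4.0 * omega * dy / y)

print(f"exact/linear, magnification: {ratio_m:.6f}")
print(f"exact/linear, shear:         {ratio_s:.6f}")
```

For fractional errors of order 1E−4 the exact and linearized residuals differ by only about 1.5 parts in 10⁴, so the neglected higher-order terms are indeed unimportant at the tolerance levels discussed in this section.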

To illustrate the tolerances on pupil misalignment we consider two 
examples, the nominal four-mirror telescope in Fig. 6.16 and Table 6.21, and 
HST with the two-mirror correction system called Corrective Optics, Space 
Telescope Aberration Recovery (COSTAR). 

From Table 6.21 we get R_1 = −48 m and y_1 = 8 m at the edge of the primary, 
giving a surface error of 9.26 mm or about 1.463E4 waves at a wavelength of 
633 nm. Assuming an allowable surface error of six waves in Eqs. (6.5.3) and 
(6.5.4) we find the following approximate limits: |ε| < 1E−4, |δy/y| < 1E−4. The 
first of these limits leads to the requirement that the ratio of the actual to the 
expected magnification differs from unity by no more than |ε|. The second limit 
gives a tolerance on pupil shear of 0.0001y at mirror M_4, or about 0.2 mm. The 
residual error of six waves gives angular coma of about 0.3 arc-sec, a reasonable 
limit for a ground-based telescope. Given the tight tolerance on δy in a telescope 
of this size, active monitoring and control of pupil shear would be necessary. 

Turning now to HST, the COSTAR system of two mirrors was an addition to 
the original telescope following the recognition that the HST primary had the 
wrong conic constant. The COSTAR mirrors M_1 and M_2 were designed to act 
much like mirrors M_3 and M_4 in a four-mirror telescope, with M_1 reimaging the 
HST exit pupil on to M_2. In the case of HST, however, the COSTAR mirrors are 
small compared to the HST primary, with each mirror about 25 mm in diameter. 
The surface error at the edge of the HST primary is about 2.2 μm or about 3.5 
waves at λ = 633 nm. The wavefront error, in turn, is about 7 waves. The 
requirement that the corrected HST be diffraction-limited leads to an 
allowable residual wavefront error of about 0.2 waves. Putting these numbers into 
Eq. (6.5.4) gives a tolerance of ±90 μm for pupil shear. Because of this stringent 
requirement, mechanisms for aligning the mirrors were an essential part of the 
COSTAR package. 
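Both tolerance estimates follow from the same ratio, as the Python sketch below shows. It assumes the first-order relation |residual| = 4 × error × fraction, where the fraction is |ε| or |δy/y|, and uses the numbers quoted in the text. The beam radius of 2 m at M_4 (kD/2 for the nominal design) and the function name are assumptions made for illustration.

```python
def fractional_tolerance(error_waves, allowed_residual_waves):
    """Largest |eps| or |dy/y| permitted by |residual| = 4 * error * fraction."""
    return allowed_residual_waves / (4.0 * error_waves)

# Four-mirror example: 1.463E4 waves of error, six waves allowed.
tol_four_mirror = fractional_tolerance(1.463e4, 6.0)
shear_m4_mm = tol_four_mirror * 2000.0     # assumed beam radius at M4: 2 m

# HST with COSTAR: 7 waves of wavefront error, 0.2 waves allowed,
# correcting mirror about 25 mm in diameter (radius 12.5 mm).
tol_hst = fractional_tolerance(7.0, 0.2)
shear_costar_um = tol_hst * 12.5e3

print(f"four-mirror: fraction = {tol_four_mirror:.2e}, shear = {shear_m4_mm:.2f} mm")
print(f"HST/COSTAR:  fraction = {tol_hst:.2e}, shear = {shear_costar_um:.0f} um")
```

The sketch returns about 1E−4 and 0.2 mm for the four-mirror telescope and about 90 μm for COSTAR, in agreement with the values in the text.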

The two effects considered so far assume that the reimaged pupil is at the 
surface of the optical element correcting SA3 at the entrance pupil. If the 
reimaged pupil lies in front of or behind the correcting surface, then there is 
no longer a one-to-one correspondence between points on the entrance pupil and 
correcting surface. In this case rays from a single point on the entrance pupil land 
on an area on the correcting surface, a larger area for a wider field of view, and the 
best that can be done is to provide an average correction. Coma and astigmatism 
will also enter into the analysis when the projected entrance pupil is not at the 
correcting surface. 

The discussion in this section is intended to illustrate the consequences of 
pupil misalignment when a wavefront error at the entrance pupil is corrected 
farther along in the optical train. We have considered only the simplest error to 
correct, that of SA3. If the primary mirror is segmented rather than monolithic, 
then piston or tilt errors of individual segments must also be considered. The 
interested reader should consult the article by Meinel and Meinel (1992) for a 
discussion of these effects. 


6.6. CONCLUDING REMARKS 


The discussion of the image characteristics in this chapter is based entirely on 
the geometric theory derived with the aid of Fermat’s Principle, without taking 
into account the limit set by diffraction. Characteristics of images in the 
diffraction limit where geometric aberrations are negligible are discussed in 
detail in Chapter 10. The relations in this chapter are derived assuming the 
mirror surfaces are essentially perfect, thus the figure on the surface of each 
mirror is according to the prescription given by Eq. (5.1.1). Real mirrors are not 
perfect and polishing errors give rise to scattered light and image degradation. We 
discuss this topic in Chapter 18. 

We have devoted most of our discussion to two-mirror telescopes because 
nearly all large reflectors are of this type. It should be evident, however, from our 
discussion of the Paul-Baker designs that families of three-mirror telescopes with 
excellent image characteristics can be found, given the additional free parameters 
with another mirror. Although many three-mirror designs have been published, 
they have common problems of image surface accessibility and larger vignetting 
of the incident beam, compared to two-mirror telescopes. With careful attention 
given to these problems, however, practical three-mirror designs with excellent 
image characteristics can be found. 

Although practical configurations of four-mirror telescopes have been 
proposed, none of these designs have yet been built. The push to build giant 
ground-based telescopes with moderate fields and excellent image quality makes 
this type of telescope a viable contender to more conventional designs. 

Finally, there are innovative telescope designs not discussed here. One of these 
is the 9-m Hobby-Eberly telescope with a segmented spherical primary and a 
four-mirror, all-reflecting Gregorian corrector located at prime focus. The 
corrector removes the very large spherical aberration of the f/1.45 primary. 
Another design not considered is the 6.5-m replacement for the Multiple-Mirror 
Telescope. This telescope has an f/5 Cassegrain focus at which a field diameter 
of 1° is obtained with an all-refractive corrector located near the Cassegrain 
focus. Both of these telescopes are primarily used with fiber-fed spectrometers. 
Chapter 7 Schmidt Telescopes and Cameras 


Typical ground-based two-mirror telescopes without correctors have usable 
field diameters of a fraction of a degree. In Chapter 9 we show that larger fields 
are obtained with the addition of a corrector system to such telescopes, reaching 
about 1° at prime focus and up to 3° at Cassegrain focus. Still larger fields require 
a telescope of the Schmidt type, or one of the many members of the family of 
telescopes based on the principle of the Schmidt. This principle is basically one of 
using a corrector plate to compensate for the spherical aberration of the reflecting 
optics and locating the plate and aperture stop to give zero coma and astigmatism 
for the system, at least to third order. 

In this chapter we consider in more detail the classical Schmidt system first 
introduced in Chapter 4, including solid and semisolid Schmidt systems in which 
all or part of the air between the optical surfaces is replaced by glass. We discuss 
derivatives of the Schmidt design, such as Schmidt-Cassegrain and Bouwers- 
Maksutov systems in Chapter 8. 

The classical Schmidt is the choice for a wide-field telescope if an aperture of 
1m or more is required. The principal reasons are its relative simplicity, only two 
large optical elements, and the smaller chromatic aberration of the aspheric 
corrector compared to that of the corrector in other types. In smaller apertures the 
choices for a wide-field instrument are a folded Schmidt or one of the two-mirror 
types. Whether the intended use is as a spectrograph camera or a telescope for 
visual observation, the requirement of an accessible focal surface is of overriding 
importance in this case. 




7.1. GENERAL SCHMIDT CONFIGURATION 


The Schmidt camera in its usual configuration is a corrector plate located at 
the center of curvature of a spherical mirror, as shown in Fig. 4.10. This 
arrangement was discussed in Section 4.5, where it was introduced to illustrate 
the application of Fermat’s Principle to cancel the on-axis aberration of the mirror 
in collimated light. The importance of locating the aperture stop at the center of 
curvature of the mirror to eliminate off-axis aberrations was also noted there. 

In this section we extend these discussions and consider the Schmidt config- 
uration in a more general way. This is done to show the range of possibilities for 
placement of the aperture stop and corrector. 

Consider the system of spherical mirror, corrector plate, and aperture stop 
shown in Fig. 7.1, with the object surface at distance s to the left of the mirror. 
The corrector plate is located a distance d to the left of the mirror, and the 
aperture stop is distance g to the left of the corrector. The distances s, d, and g in 
Fig. 7.1 are negative according to the sign convention. We choose n_1 to denote the 
index of the medium for the rays incident on the mirror and reserve n and n’ for 
the media before and after the aspheric correcting surface. 

Defining k = y_2/y_1, the ratio of the beam height at the mirror to that at the 
corrector, we see from Fig. 7.1 that k = s/(s — d). 

The aberration coefficients for the corrector and mirror are found in Table 5.5 
and Table 5.6, respectively, with only the b terms taken for the corrector. 
Substituting these results into Eq. (5.6.7) to get the system coefficients gives 








2 
n= 4] 0-Fe(E4)) | (7.1.1) 
1 
ste} os 
2 2 
a= Elo - (1-5) | (7.13) 


where W = d + g. From the system aberration coefficients, and the requirement 
that each be zero, we can determine what freedom, if any, there is in their 
locations. 

Setting Eq. (7.1.1) to zero, putting the result for b into Eqs. (7.1.2) and (7.1.3), 
and setting each equal to zero gives the condition, 


l/m+1 1 W 
Fig. 7.1. Schmidt camera with stop at distance g from corrector and object at distance s from 
mirror. 


Using the relation between R and m in Table 5.2 and substituting for W and k in 
terms of s, d, and g, we find 


g(R − s) = (s − d)(R − d − g). 


Solving this equation for d gives two solutions: d = R and d = s — g. The first of 
these solutions places the corrector at the center of curvature of the mirror, the 
same location as in the earlier discussions. The second solution gives W = s, 
hence the stop is at the object surface. This result is untenable and is discarded 
because it violates the condition that y& is small, as is evident by putting W = s 
into Eq. (5.5.2). 
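The two solutions can be confirmed numerically. The Python sketch below rewrites the condition g(R − s) = (s − d)(R − d − g) as a quadratic in d and solves it for one set of sample distances. The sample values are hypothetical, chosen negative to follow the sign convention of Fig. 7.1.

```python
import math

def corrector_positions(R, s, g):
    """Solve g(R - s) = (s - d)(R - d - g) for d.

    Expanding gives d^2 - (s + R - g) d + R (s - g) = 0.
    """
    b = -(s + R - g)
    c = R * (s - g)
    root = math.sqrt(b * b - 4.0 * c)   # discriminant equals (R - s + g)^2
    return sorted(((-b - root) / 2.0, (-b + root) / 2.0))

# Hypothetical distances in mm, all negative by the sign convention.
R, s, g = -2000.0, -10000.0, -300.0
solutions = corrector_positions(R, s, g)
print(solutions)   # the two roots are s - g and R

for d in solutions:
    # Each root must satisfy the original condition.
    assert abs(g * (R - s) - (s - d) * (R - d - g)) < 1e-6
```

For these values the roots are d = −9700 mm, which is s − g, and d = −2000 mm, which is R, the two solutions stated in the text.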

With d = R, hence W = R + g, we find k = −(m − 1)/(m + 1), and therefore 
the aspheric factor is 

\[ b = \frac{2n_1}{R^3}\left(\frac{m-1}{m+1}\right)^2. \tag{7.1.5} \]

For collimated light m = 0 and b = 2n_1/R³, the result given in Section 5.5. For 
the configuration shown in Fig. 7.1, n_1 = 1, m < 0 and |m| < 1. Thus the factor 
in parentheses in Eq. (7.1.5) is larger than one, and b for noncollimated light is 
larger than for collimated light. 

The upshot of this analysis is that for either collimated or noncollimated light 
the corrector plate must be located at the center of curvature of the mirror, but the 
location of the stop is arbitrary, provided W/s is not close to unity. Note that if an 
optical system precedes the Schmidt camera, the stop is the exit pupil of the 
preceding system. 

This result is important because in some configurations using a Schmidt 
camera the stop or pupil is necessarily displaced from the corrector. An example 
of this is a camera in a spectrograph where the pupil is usually at the prism or 
grating and different wavelengths leave the dispersing element in different 
directions. It is worth noting here that when the stop is displaced from the 
corrector, the corrector is larger and its chromatic effects are also larger. To 
minimize the chromatic effects, therefore, the pupil should be at the corrector or 
as close as can be arranged. We discuss the relation between the pupil location 
and the chromatic effect in a following section. 


7.2. CHARACTERISTICS OF ASPHERIC PLATE 


The aspheric plate is obviously the key to a correctly configured Schmidt 
system and we now consider its aberration characteristics in some detail. In this 
section we consider the finite thickness of a real plate and its effect on the 
aberrations, and the effect of the radius term introduced in Section 4.5 to 
minimize the chromatic aberration of the plate. We also discuss chromatic 
aberration in more detail than in Section 4.5, and give relations for fifth-order 
spherical aberration of an aspheric plate and spherical mirror in collimated light. 

The equation for an aspheric surface is given by an extension of Eq. (5.1.1), 
with K = —1, as follows: 


= 





a ee ee 
= —_ + Fr’ + Fr’, 7.2.1 
R 8G a oa Oe r teh) 


where the latter form in Eq. (7.2.1) is that usually used in ray-tracing programs. 
We include the terms in r⁶ in anticipation of the section on fifth-order spherical 
aberration. 

The difference between setting K = −1 versus K = 0 is of no practical
consequence for a refracting plate. If K = 0 the added terms, $r^4/8R_c^3$ and
$r^6/16R_c^5$, are each several orders of magnitude smaller than the terms in b and
b', respectively, for any practical plate. For the corrector plate example in Section
7.3, the effect of these added terms is to change the thickness at the margin by
less than 0.3 nm.
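
As a rough numerical check of this statement, the sketch below evaluates the two added terms at the plate margin. It assumes the values quoted for the Section 7.3 example (radius of curvature of −306 667 mm for the plate surface, plate radius 500 mm); the function name is ours and purely illustrative.

```python
def conic_added_terms_mm(r, R_c):
    """Extra sag (mm) when K = 0 is used instead of K = -1:
    r^4/(8 R_c^3) + r^6/(16 R_c^5)."""
    return r**4 / (8.0 * R_c**3) + r**6 / (16.0 * R_c**5)

# Values assumed from the Section 7.3 example (all lengths in mm).
delta_nm = conic_added_terms_mm(500.0, -306667.0) * 1e6  # mm -> nm
print(f"change in sag at the margin: {abs(delta_nm):.2f} nm")
```

The fourth-power term carries essentially all of the change; the sixth-power term is smaller by roughly six orders of magnitude.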


7.2.a. CHROMATIC ABERRATION 


One approach to finding the chromatic properties of an aspheric plate is given 
in Section 4.5.b, where the analysis gives a relation for the minimum chromatic 
spherical aberration in Eq. (4.5.15). In this section we determine the chromatic
properties in a more general way, including the effect of a stop displaced from the
aspheric plate.

We first consider the Schmidt system shown in Fig. 4.10 with the aspheric
figure on the surface facing the mirror. The plate of index n is located in air. For a
plate of radius $r_0$ the profile of the aspheric surface can be written as


$$(n-1)\,z = \frac{1}{32f^3}\left(r^4 - a\,r_0^2 r^2\right), \tag{7.2.2}$$

where f = −R/2 and a is an arbitrary parameter. Note that this relation is simply
Eq. (4.5.11) written with a/4 replacing 3/8 in the term containing $r^2$.

For a ray parallel to the z-axis, the angle of deviation δ at height r at the
aspheric surface is given by δ = i(n − 1), as shown in Fig. 7.2. The angle of
incidence i at the aspheric surface is i = −dz/dr, where dz/dr is the slope of the
normal to the surface. Thus

$$\delta = -(n-1)\frac{dz}{dr} = \frac{1}{16f^3}\left(a\,r_0^2 r - 2r^3\right). \tag{7.2.3}$$

From Eq. (7.2.3) we see that δ = 0 when r = 0 and $r = r_0\sqrt{a/2}$, where the latter
value defines the radius of the neutral zone.

Inside the neutral zone the ray deviation is a maximum at the inflection zone,
defined as that r for which dδ/dr = 0, while outside the neutral zone δ is a
maximum at the edge of the plate. The characteristics of the aspheric surface at
these zones, expressed in terms of a, are given in Table 7.1.

From the entries in Table 7.1 it is evident that the deviations at the inflection
zone and edge have opposite signs for a < 2. As a increases from zero, δ at the
inflection zone increases while δ at the edge decreases. The net deviation across
the plate is a minimum when the values of δ at these two radii are equal in
magnitude. This is obtained with the choice a = 1.5 and the resulting magnitude
of $\delta = r_0^3/32f^3$ at these radii. The neutral zone is then at $r = r_0\sqrt{3}/2 = 0.866\,r_0$.
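
The choice a = 1.5 can be checked numerically. The following sketch, written in normalized units (radius in units of the plate radius, deviation in units of $r_0^3/f^3$), evaluates Eq. (7.2.3) at the inflection zone and at the edge and scans a for the smallest peak deviation. The function names and the scan step are our own choices.

```python
import math

def deviation(rho, a):
    """Deviation in units of r0^3/f^3 at normalized radius rho = r/r0,
    from Eq. (7.2.3): (a*rho - 2*rho**3) / 16."""
    return (a * rho - 2.0 * rho**3) / 16.0

def zone_deviations(a):
    """Return (deviation at inflection zone, deviation at edge)."""
    rho_inflection = math.sqrt(a / 6.0)   # radius where d(delta)/dr = 0
    return deviation(rho_inflection, a), deviation(1.0, a)

def max_abs_deviation(a):
    """Largest deviation magnitude over the two critical zones."""
    return max(abs(d) for d in zone_deviations(a))

# Coarse scan over 0 <= a <= 2 in steps of 0.001.
best_a = min((k / 1000.0 for k in range(0, 2001)), key=max_abs_deviation)
d_inf, d_edge = zone_deviations(1.5)
print(best_a, d_inf, d_edge, math.sqrt(1.5 / 2.0))
```

At a = 1.5 the two deviations are +1/32 and −1/32 in these units, and the last printed number is the normalized neutral-zone radius, 0.866.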





Fig. 7.2. Angle of deviation δ at wedge-shaped section of aspheric plate.

Table 7.1

Deviations at Zones of Aspheric Plate

Zone          Radius                 Deviation
Inflection    $r_0 (a/6)^{1/2}$      $(r_0^3/4f^3)(a/6)^{3/2}$
Neutral       $r_0 (a/2)^{1/2}$      0
Edge          $r_0$                  $(r_0^3/16f^3)(a-2)$


Differentiating Eq. (7.2.1) and setting it equal to zero, substituting for r at the
neutral zone, and solving for $R_c$, gives

$$R_c = -\frac{8(n'-n)}{3b\,r_0^2} = -\frac{1}{3E\,r_0^2}, \tag{7.2.4}$$

where n' = 1 for the configuration in Fig. 4.10.

The sign of b depends only on the character of the plate. From Eq. (7.1.5) we
see that b < 0 for a plate in a Schmidt camera because $n_2$ and R are always of
opposite sign. As shown in Section 4.5.b and Fig. 4.11, a Schmidt plate has a
“turned-up” edge. Conversely, b > 0 for a plate with a “turned-down” edge. The
sign of E, on the other hand, depends on whether the aspheric is on the first or
second surface of the corrector, and on the direction of light through the plate.
Note also from Eq. (7.2.4) that E and $R_c$ always have opposite signs in order to
place the neutral zone at the desired radius.

The chromatic blur is obtained by finding the variation of δ with changing n.
Using Eq. (7.2.3) we get

$$\frac{d\delta}{dn} = -\frac{dz}{dr} = \frac{\delta}{n-1}. \tag{7.2.5}$$

Figure 7.3 shows two rays for different values of n leaving a point on the aspheric
surface and intersecting the mirror a distance R dδ apart. The point on the
aspheric surface can, in effect, be considered an object point at distance R that is
reimaged at the corrector. Hence the blur at the focal surface for these two rays is
f dδ. Substituting the values of δ at the inflection zone and edge into Eq. (7.2.5)
gives a blur diameter of 2f dδ. Hence

$$\mathrm{CSA} = 2f\,d\delta = \frac{f\,dn}{128F^3(n-1)}, \tag{7.2.6}$$

where CSA is the chromatic spherical aberration and F is the focal ratio of the
mirror. This result is, as expected, equivalent to that given in Eq. (4.5.15).

Fig. 7.3. Paths of rays of different wavelength through Schmidt camera. See Eqs. (7.2.5) and
(7.2.6).

The results so far are appropriate for a corrector with aperture stop at the plate.
When the stop or pupil is displaced from the plate, as shown in Fig. 7.4, the
radius of the plate must be larger by a factor Γ to accept all of the light without
vignetting. If $r_0$ is the radius of a collimated beam at the plate, then
$\Gamma = 1 + W\theta/r_0$, and all of the results in Table 7.1 apply to the enlarged plate
if $r_0$ is replaced by $\Gamma r_0$.

The chromatic effects are again minimized by choosing a = 1.5, hence the
neutral zone is at $r = 0.866(\Gamma r_0)$. The deviations at the inflection zone and edge
are now larger by the factor $\Gamma^3$, as is the blur diameter in Eq. (7.2.6). It is clear
from this result that the placement of the stop or pupil at the corrector is the
preferred choice to minimize chromatic effects. As a final item note that the
relations for $R_c$ in Eq. (7.2.4) apply to a plate of radius $\Gamma r_0$ with the substitution
of $\Gamma r_0$ for $r_0$.
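
To give a feel for the size of this effect, the sketch below evaluates the enlargement factor and the resulting cubic scaling of the chromatic blur. The stop distance, field angle, and beam radius are hypothetical numbers chosen only for illustration, and θ is taken here to be the field angle expressed in radians.

```python
import math

def plate_enlargement(W, theta_deg, r0):
    """Enlargement factor 1 + W*theta/r0 for a stop a distance W from the
    plate; theta is the field angle (degrees in, radians used internally)."""
    return 1.0 + W * math.radians(theta_deg) / r0

# Hypothetical numbers for illustration only (not taken from the text):
gamma = plate_enlargement(W=1000.0, theta_deg=1.0, r0=500.0)
print(f"enlargement = {gamma:.4f}, blur scale = {gamma**3:.4f}")
```

Even a modest enlargement of a few percent raises the blur by about three times that percentage, because the deviations scale as the cube of the plate radius.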





Fig. 7.4. Corrector size required to cover field when stop is displaced from plate. 


7.2.b. ABERRATION COEFFICIENTS


We now determine the effect of the plate thickness and radius $R_c$ on the
aberration coefficients of a real plate in collimated light. Consider a plate of index
n and thickness t, as shown in Fig. 7.5. Its first surface is plane and its second
surface has a radius of curvature $R_c$ and an aspheric term b, with the pupil for the
plate located a distance $W_1$ to the left of the first surface. The image of the pupil
by the first surface is $W_1'$ from this surface and $W_2$ from the second surface.
Because collimated light is incident on the plate, the beam heights at the two
surfaces are equal.

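The aberration coefficients for the first surface are zero when the light is
collimated. The coefficients for the second surface from Table 5.5 are

$$\begin{aligned}
B_{1c} &= \frac{\theta^2(n-1)}{2R_c}\left(1-\frac{W_2}{R_c}\right)^2 - \frac{b\theta^2}{2}\left(\frac{W_2}{n}\right)^2,\\
B_{2c} &= \frac{n\theta(n-1)}{2R_c^2}\left(1-\frac{W_2}{R_c}\right) + \frac{b\theta}{2}\left(\frac{W_2}{n}\right),\\
B_{3c} &= -\frac{b}{8} + \frac{n^2(n-1)}{8R_c^3},
\end{aligned} \tag{7.2.7}$$

where

$$W_1' = nW_1, \qquad W_2 = W_1' - t = nW_1 - t. \tag{7.2.8}$$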





Fig. 7.5. Cross section of aspheric plate with stop AS and pupil EP. See Eq. (7.2.8). 

Substituting Eq. (7.2.8) into Eqs. (7.2.7), and assuming that $W_2 \ll R_c$ for all
configurations using an aspheric plate, we find the following aberration
coefficients for the corrector:

$$\begin{aligned}
B_{1c} &= \frac{\theta^2(n-1)}{2R_c} - \frac{b\theta^2}{2}\left(W_1 - \frac{t}{n}\right)^2,\\
B_{2c} &= \frac{n\theta(n-1)}{2R_c^2} + \frac{b\theta}{2}\left(W_1 - \frac{t}{n}\right),\\
B_{3c} &= -\frac{b}{8} + \frac{n^2(n-1)}{8R_c^3}.
\end{aligned} \tag{7.2.9}$$

Substituting Eq. (7.2.4) for one $R_c$ in $B_{3c}$ of Eqs. (7.2.9), we find that the
right-hand term is of order $(r_0/R_c)^2$ smaller than the first term. For any practical
plate $r_0 \ll R_c$ and the right-hand term in $B_{3c}$ is negligible.

For the remaining coefficients in Eq. (7.2.9) we see that the contribution of the
aspheric term in each is zero when $W_1 = t/n$. The remaining terms are simply
those for a plano-convex lens in collimated light. If Eq. (7.2.4) is substituted into
$B_{1c}$ and $B_{2c}$, and $W_1 - t/n$ is replaced by ε, the coefficient $B_{1c}$ is dominated by the
term in $R_c$ when ε is small. For $B_{2c}$, on the other hand, the term in ε dominates
when $|\varepsilon| > r_0^2/R_c$, hence the coma coefficient is sensitive to small changes in ε.
This result can be used to minimize the effect of coma in a Schmidt system by
adjusting the plate location.

Relations of a comparable form to those in Eq. (7.2.9) are obtained for an
aspheric plate with its figured surface facing the incident light, with the principal
change one of substituting $W_1$ for $W_1 - t/n$. The comments in the previous
paragraph on the dependence of $B_{1c}$ and $B_{2c}$ on small values of $W_1$ hold without
change.

If the incident light is coming from a source at a finite distance, then there is an
additional contribution to each of the aberration coefficients from the plate
thickness. These effects are easily derived with the aid of the geometry in Fig. 7.6
for a plane-parallel plate, with the coefficients for each surface taken from Table
5.1 with b = 0 and R = ∞. When these relations are substituted into Eq. (5.6.7),
an exercise left to the reader, we find for a plate p

$$B_{3p} = \frac{(n^2-1)\,t}{8n^3s^4}, \tag{7.2.10}$$

$$B_{2p} = -\frac{\theta\,(n^2-1)\,t}{2n^3s^3}, \tag{7.2.11}$$

$$B_{1p} = \frac{\theta^2(n^2-1)\,t}{2n^3s^2}. \tag{7.2.12}$$

Fig. 7.6. Cross section of plane-parallel plate of index n in air.


The importance of these coefficients for an aspheric plate in noncollimated light 
depends on the specifics of a given configuration. In most configurations it turns 
out that their contributions are of little significance, with the details best left to 
computer ray-trace analysis. 

Although the term in $R_c$ in $B_{3c}$ of Eq. (7.2.9) is negligible, the value of $R_c$ does
affect the optimum choice of b required to zero the third-order spherical
aberration of the system. With the addition of a radius term the corrector
becomes, in effect, a weak positive lens with an aspheric figure. The effect of
the lens part of the corrector is to convert the incident collimated light into a
slightly converging beam. Thus the marginal rays intersect the mirror at a slightly
smaller distance from the mirror vertex, as compared to the case where the
corrector has no radius term. Omitting the details of the derivation, the spherical
aberration coefficient for a Schmidt system in collimated light is given by

$$B_{3s} = -\frac{b}{8} + \frac{n_2}{4R^3}\left[1 - \frac{3}{2}\left(\frac{r_0}{R}\right)^2\right] \tag{7.2.13}$$

for the case where $R_c$ is chosen according to Eq. (7.2.4). Setting $B_{3s}$ equal to zero
gives

$$b = \frac{2n_2}{R^3}\left[1 - \frac{3}{2}\left(\frac{r_0}{R}\right)^2\right] = \frac{2n_2}{R^3}\left[1 - \frac{3}{32F^2}\right]. \tag{7.2.14}$$

For typical values of F the result is a reduction of 1 or 2% in the magnitude of b
needed to cancel the spherical aberration of the mirror.
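
The size of this correction is easily evaluated. The sketch below applies Eq. (7.2.14) to a mirror with R = −5000 mm at F = 2.5, the values of the example in Section 7.3; the function and argument names are our own, and the index ahead of the mirror is assumed to be 1.

```python
def schmidt_b(R, focal_ratio, n2=1.0, with_radius_term=True):
    """Aspheric coefficient b from Eq. (7.2.14); with_radius_term=False
    returns the uncorrected value 2*n2/R^3."""
    b0 = 2.0 * n2 / R**3
    if not with_radius_term:
        return b0
    return b0 * (1.0 - 3.0 / (32.0 * focal_ratio**2))

R, F = -5000.0, 2.5                 # mm and focal ratio (Section 7.3 example)
b_plain = schmidt_b(R, F, with_radius_term=False)
b_corr = schmidt_b(R, F)
print(b_plain, b_corr, 1.0 - b_corr / b_plain)
```

For this focal ratio the fractional reduction is 3/200, or 1.5%, in line with the range quoted above.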

With the exception of the correction given by Eq. (7.2.14), the effects of the 
plate thickness and radius of curvature on the aberration coefficients of an 
aspheric plate are usually small compared to those of terms containing b. 
Therefore the usual approach to the analysis of a system that includes one or 
more aspheric plates is to include only terms in b and let this description serve as 
the starting point for a ray-trace analysis. 

Table 7.2

Fifth-Order Spherical Aberration Resultsᵃ

Surface             $B_5$            ASA5
Aspheric            $-b'/16$         $-(3b'/8)\,r^5$
Spherical mirror    $3n_2/8R^5$      $(9n_2/4)(r/R)^5$

ᵃ Results valid for collimated light only. ASA5 = $6B_5 r^5$.


7.2.c. FIFTH-ORDER SPHERICAL ABERRATION 


The prescription of a Schmidt corrector plate usually includes higher-order 
aberration terms. Fifth-order spherical aberration is the most significant of these 
terms and we give here, without derivation, the necessary relations for a spherical 
mirror and aspheric corrector in collimated light. 

The fifth-order spherical aberration coefficients, denoted by $B_5$, are obtained
after a lengthy analysis paralleling that in Chapter 5, but with θ set equal to zero.
The results of this analysis are given in Table 7.2, with the entry for ASA5 of the
mirror derived from Eq. (4.2.1).

The calculation of the system aberration coefficient $B_{5s}$ for the combined
aspheric and spherical mirror is carried out using Eq. (5.6.7) with j = 5. Because
the ray heights at the corrector and mirror are equal in a first approximation, $B_{5s}$ is
simply the sum of the coefficients in Table 7.2. Setting the sum equal to zero
gives $b' = 6n_2/R^5$; with this choice of b' the fifth-order spherical aberration of the
system is zero.
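
As an illustration, the following sketch evaluates b' and the on-axis angular aberrations of a bare spherical mirror with R = −5000 mm and semi-aperture 500 mm, the case treated in Section 7.3. It assumes that the third-order angular aberration is 4 times the third-order coefficient times $r^3$, with a coefficient of magnitude $n_2/4R^3$ for a sphere in collimated light, and it assumes a mirror coefficient of magnitude $3n_2/8R^5$ in fifth order; these assumptions are ours and are checked only against the numbers quoted in Section 7.3.

```python
import math

ARCSEC_PER_RAD = math.degrees(1.0) * 3600.0

def fifth_order_b(R, n2=1.0):
    """b' = 6*n2/R^5, the value that zeros the summed fifth-order term."""
    return 6.0 * n2 / R**5

def mirror_asa_arcsec(r, R, n2=1.0):
    """Magnitudes of ASA3 and ASA5 (arc-sec) for a spherical mirror in
    collimated light. Assumed forms: ASA3 = 4*B3*r^3 with B3 = n2/(4 R^3),
    ASA5 = 6*B5*r^5 with B5 = 3*n2/(8 R^5)."""
    asa3 = abs(4.0 * (n2 / (4.0 * R**3)) * r**3)
    asa5 = abs(6.0 * (3.0 * n2 / (8.0 * R**5)) * r**5)
    return asa3 * ARCSEC_PER_RAD, asa5 * ARCSEC_PER_RAD

print(fifth_order_b(-5000.0))
print(mirror_asa_arcsec(500.0, -5000.0))
```

Under these assumptions the two angular aberrations reduce to $(r/R)^3$ and $(9/4)(r/R)^5$ radians, so the fifth-order term is only about 2% of the third-order term at this focal ratio.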


7.3. SCHMIDT TELESCOPE EXAMPLE 


We now apply the preceding results to an example of a 1-m Schmidt telescope
with F = 2.5. The aspheric surface on the corrector plate faces the mirror; the
plate material is SiO₂ and its thickness is 10 mm at the vertex. The parameters
$R_c$, E, and F in Eq. (7.2.1) are calculated at λ = 548 nm, at which wavelength the
plate index is 1.460. Values of the telescope parameters are given in Table 7.3,
with b given both for $R_c = \infty$ and according to Eq. (7.2.14). The depth of the
corrector at the neutral zone, calculated from Eq. (7.2.1), is 0.1534 mm.
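
The plate parameters can be reproduced with a few lines of code. The sketch below converts b and b' to the ray-tracing coefficients E and F of Eq. (7.2.1), obtains the radius term from Eq. (7.2.4), and evaluates the sag at the neutral zone. It is our own illustration: the function names are hypothetical, and the quoted depth is recovered here by combining the radius term computed from b = 2/R³ with E computed from Eq. (7.2.14), which appears to be the combination used.

```python
import math

def plate_coefficients(b, b_prime, n, r0, n_after=1.0):
    """E, F and R_c for the aspheric surface of Eq. (7.2.1), with R_c from
    Eq. (7.2.4) so that the neutral zone lies at 0.866*r0."""
    E = b / (8.0 * (n_after - n))
    F = b_prime / (16.0 * (n_after - n))
    R_c = -1.0 / (3.0 * E * r0**2)
    return E, F, R_c

def sag(r, R_c, E, F):
    """Surface height z(r) from Eq. (7.2.1), in mm."""
    return r**2 / (2.0 * R_c) + E * r**4 + F * r**6

n, r0 = 1.460, 500.0
E0, F, R_c = plate_coefficients(-1.60e-11, -1.92e-18, n, r0)   # b = 2/R^3
E1, _, _ = plate_coefficients(-1.576e-11, -1.92e-18, n, r0)    # Eq. (7.2.14)
r_neutral = r0 * math.sqrt(3.0) / 2.0
print(E0, F, R_c)
print(abs(sag(r_neutral, R_c, E1, F)))
```

The sag is negative with these sign conventions; its magnitude is the depth quoted in the text.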

Results from a ray-trace analysis are given in Table 7.4, with all aberrations
given in angular terms in units of arc-seconds. Various combinations of
parameters from Table 7.3 are used to illustrate the effect of each of the
parameters on the angular aberrations. Note that the on-axis angular aberrations
of the mirror without corrector, given in arc-seconds, are ASA3 = 206.3 and
ASA5 = 4.64, while the off-axis aberrations are zero in the third-order
approximation.

Table 7.3

Parameters of 1-m Schmidt Telescopeᵃ

R = −5000 mm                      $r_0$ = 500 mm
b = 2/R³ = −1.60E-11              E = 4.34785E-12
b' = 6/R⁵ = −1.92E-18             F = 2.6087E-19
$R_c = -1/3Er_0^2$ = −306 667

From Eq. (7.2.14)
b = −1.576E-11                    E = 4.2826E-12

ᵃ Values of E and F are computed using n = 1.46.

Examination of the results in Table 7.4 clearly shows the improvement in the
on-axis image quality when a fifth-order term is included in the aspheric and the
third-order aspheric term is calculated from Eq. (7.2.14). We also see that there
are small but nonzero off-axis aberrations that appear when the radius term is
included on the corrector. These aberrations are a result of the terms in $R_c$ in Eq.
(7.2.9). The presence of these off-axis aberrations limits the field size, and ray
traces of the final system in Table 7.4 give an image blur diameter of about
1 arc-sec at a field angle of 3.5°.

Values for the angular chromatic spherical aberration, computed from Eq.
(7.2.6), are shown in Table 7.5, where the indices are those of SiO₂. Because the
index of refraction rises more steeply at shorter wavelengths, the chromatic blur
increases rapidly for blue and ultraviolet wavelengths.
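
The chromatic blur values can be estimated directly from Eq. (7.2.6). The sketch below divides the blur diameter by the focal length to get an angle and converts to arc-seconds, using the SiO₂ indices listed in Table 7.5 and a design index of 1.460. The function name is ours, and the use of the design index in the factor n − 1 is our reading of the equation.

```python
import math

ARCSEC_PER_RAD = math.degrees(1.0) * 3600.0

def angular_csa_arcsec(n, n_design, focal_ratio):
    """Angular chromatic spherical aberration from Eq. (7.2.6) divided by f:
    |dn| / (128 F^3 (n_design - 1)), converted to arc-seconds."""
    dn = abs(n - n_design)
    return dn / (128.0 * focal_ratio**3 * (n_design - 1.0)) * ARCSEC_PER_RAD

# SiO2 indices as listed in Table 7.5; design wavelength 548 nm.
indices = {350: 1.47689, 400: 1.47012, 450: 1.46577,
           548: 1.46000, 650: 1.45650, 700: 1.45523}
for wavelength_nm, n in indices.items():
    print(wavelength_nm, round(angular_csa_arcsec(n, 1.46, 2.5), 2))
```

The values obtained this way are close to, though not identical with, the tabulated ones; most agree to about 0.01 arc-sec, and the 450-nm entry differs by about 0.05 arc-sec.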


Table 7.4

Ray-Trace Results for 1-m Schmidt Telescopeᵃ,ᵇ

System Parameters

−b           −b'          $-R_c$       ASA3     ASA5    ATC      AAS
1.60E-11     0            ∞            <0.01    4.76    0.000    0.000
1.60E-11     0            306667ᶜ      3.09     4.69    0.010    0.048
1.60E-11     1.92E-18     306667       3.09     0.02    0.010    0.048
1.576E-11    1.92E-18     306667ᵈ      0.01     0.02    0.010    0.048

ᵃ Telescope scale = 82.5 arc-sec/mm or 12.1 µm/arc-sec.
ᵇ Ray traces at λ = 548 nm with θ = 1°. Angular aberrations are given in arc-seconds.
ᶜ Shift of 9.38 mm from paraxial focus; see Δf in Eq. (4.5.8).
ᵈ Shift of 9.41 mm from paraxial focus.


Table 7.5

Image Diameters for 1-m Schmidt Telescopeᵃ

λ (nm)    n          ACSAᵇ
350       1.47689    3.78
400       1.47012    2.26
450       1.46577    1.24
548       1.46000    0.00
650       1.45650    0.79
700       1.45523    1.08

ᵃ Image diameters are given in arc-sec.
ᵇ ACSA = angular chromatic spherical aberration.


From the results in Table 7.5 it is evident that a single corrector does not give 
good images at all wavelengths over an extended spectral range. One alternative is 
to have several correctors, each designed to give good images over a selected 
range of wavelengths. Although this option is practical for a small telescope or 
camera, it is not considered practical for a Schmidt telescope of the 1-m class. 

A different alternative, suggested by Bowen (1960), is to design the corrector 
for a wavelength near the short end of the desired range and to use a flat glass 
plate of appropriate thickness to partially correct the chromatic aberration of the 
corrector at longer wavelengths. This plate, usually a filter to remove shorter 
wavelengths, is placed in the converging beam close to the focal surface. For 
details on this approach the reader should consult the reference by Bowen. A final 
alternative is to use an achromatic corrector made of two different glasses, the 
subject of the next section. 

The Schmidt telescope example in this section is intended primarily to 
illustrate the application of the theory to the design of a wide-field telescope. 
The focal surface is curved and further refinement of the design might include the 
addition of a field-flattener lens, as discussed in Section 5.7. Such a lens will 
introduce spherical aberration over the entire field and coma near the edge of the 
field, hence the parameters of the corrector will have to be adjusted to get an 
optimum system. The process of optimization is best carried out with a computer 
ray-trace program and will not be pursued here. For a theoretical discussion of the 
aberrations of a field-flattened Schmidt camera the reader should consult the 
reference by Linfoot (1955). 

A final point worth noting is the increasing importance of higher-order 
aberrations for smaller focal ratios. The importance of fifth-order spherical 
aberration is evident in our example, but in faster cameras it is necessary to 
consider the effects of still higher orders. In addition, fifth-order off-axis 
aberrations become important and attention must be given to their effects in 
the design of a fast camera. 


7.4. ACHROMATIC SCHMIDT TELESCOPE


The wavelength range over which a Schmidt telescope with a single element 
corrector gives images of acceptable size is set by the dispersive characteristics of 
the corrector. This range can be extended by replacing the single-element plate 
with a two-element corrector, with each element a glass of different dispersive 
characteristics and plate parameters to make the combination achromatic. In this 
section we outline the procedure for making an achromatic corrector and apply 
the results to an example of a 1-m Schmidt telescope. 

A cross section of a two-element corrector is shown in Fig. 7.7, with the plane 
surfaces of the elements in contact and the aspheric surfaces facing outward. The 
differential deviation for each element is given by Eq. (7.2.5), which can be 
written as 


$$d\delta_1 = \delta_1/V_1, \qquad d\delta_2 = \delta_2/V_2, \tag{7.4.1}$$

where

$$V_1 = \frac{\langle n_1\rangle - 1}{n_1' - n_1}, \qquad V_2 = \frac{\langle n_2\rangle - 1}{n_2' - n_2}. \tag{7.4.2}$$

In Eq. (7.4.2) V is the Abbe number and ⟨n⟩ is the mean of the indices in the
denominator for each glass. The primed indices are taken at a shorter wavelength,
by convention, hence V is positive.

The achromatic condition requires that $d\delta_1 = -d\delta_2$, hence a change in
deviation with wavelength in one glass is compensated by a change of opposite
sign in the other glass. Therefore

$$\delta_1/V_1 = -\delta_2/V_2. \tag{7.4.3}$$


Fig. 7.7. Cross section of portion of achromatic corrector. The net deviation δ is the sum of the
deviations of individual elements.

Because the Abbe numbers are positive, the deviations of the two elements are
opposite in sign and one plate has a “turned-up” edge, as for a normal Schmidt 
plate, while the other has a “turned-down” edge, as shown in Fig. 7.7. 

Assuming that the deviation at the plane interface is negligible, the net
deviation of the achromatic plate is

$$\delta = \delta_1 + \delta_2 = \delta_1\left(\frac{V_1 - V_2}{V_1}\right). \tag{7.4.4}$$

Given that the achromatic plate is a replacement for a single plate, the deviation
given by Eq. (7.4.4) must be the same as that in Eq. (7.2.3).

From Eqs. (7.4.3) and (7.4.4) we see that both $\delta_1$ and $\delta_2$ are larger than δ in
magnitude, hence each aspheric surface has a larger local slope than on the single
plate at the same ray height. If the achromatic plate is oriented as shown in Fig.
7.7, δ and $\delta_1$ have opposite signs and from Eq. (7.4.4) we find $V_1 < V_2$. Thus the
element with the turned-up edge is the one with the larger Abbe number, a result
true for either orientation of the corrector.
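
A short numerical sketch makes the relative sizes of the deviations concrete. It solves Eqs. (7.4.3) and (7.4.4) for the ratios of each element deviation to the net deviation, using the Abbe numbers of the LLF2 and UBK7 pair adopted later in this section (Table 7.6); the function name is ours.

```python
def element_deviation_ratios(V1, V2):
    """Ratios delta1/delta and delta2/delta from Eqs. (7.4.3) and (7.4.4)."""
    ratio1 = V1 / (V1 - V2)
    ratio2 = -ratio1 * V2 / V1
    return ratio1, ratio2

# Abbe numbers of the LLF2 (element 1) and UBK7 (element 2) pair, Table 7.6.
r1, r2 = element_deviation_ratios(9.799, 14.27)
print(r1, r2, r1 + r2)
```

For this pair each element deviates a ray by two to three times the net deviation, with opposite signs, and the two ratios sum to unity as required.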

It is evident from Eqs. (7.4.3) and (7.4.4) that the ratios $\delta_1/\delta$ and $\delta_2/\delta$ are
independent of ray height r for a given set of glasses. Substituting Eq. (7.2.1) into
Eq. (7.2.3) we see that each δ has the form

$$\delta_i = -(n_i - 1)\left(c_i r + 4E_i r^3 + 6F_i r^5\right), \tag{7.4.5}$$

where $c_i$ is the vertex curvature. If the ratio of one δ to another is independent of
r, then it follows that the ratios of corresponding plate parameters must also have
a common value. Substituting Eq. (7.4.5) into Eq. (7.4.4) and applying this
condition gives

$$\frac{c_1}{c} = \frac{E_1}{E} = \frac{F_1}{F} = \left(\frac{n-1}{\langle n_1\rangle - 1}\right)\left(\frac{V_1}{V_2 - V_1}\right), \tag{7.4.6}$$

where the unsubscripted parameters are those of the single element corrector.
Note the reversed order of the factors in the difference of the Abbe numbers
between Eqs. (7.4.4) and (7.4.6), a consequence of the sign difference between δ
and $\delta_1$. Using Eq. (7.4.3) we find

$$\frac{c_2}{c_1} = \frac{E_2}{E_1} = \frac{F_2}{F_1} = \left(\frac{\langle n_1\rangle - 1}{\langle n_2\rangle - 1}\right)\frac{V_2}{V_1}. \tag{7.4.7}$$


All of the relations needed to specify an achromatic plate are now in hand, and 
their application is straightforward once a suitable pair of glasses is chosen. 
We choose two glasses from the Schott catalog, UBK7 and LLF2, the former a 
crown glass and the latter a light flint. Both glasses have good internal 
transmittances in the near ultraviolet, with values of 0.85 and 0.74 at 
λ = 320 nm for a 10-mm thickness of UBK7 and LLF2, respectively. The pair
of chosen wavelengths at which to make the plate achromatic are 320 and
880 nm, with the indices at these wavelengths and Abbe numbers shown in Table
7.6.

Given the results in Table 7.6 and the discussion following Eq. (7.4.4), LLF2 
and UBK7 are the glasses for elements 1 and 2, respectively, of the corrector 
shown in Fig. 7.7. The mean index ⟨n⟩ in Table 7.6 is approximately the index at
λ = 420 nm for each glass. We use this wavelength to calculate the parameters of
a single element SiO₂ plate needed in Eq. (7.4.6).

The Schmidt telescope used in the following comparison is the same one used 
in the previous section, with Eqs. (7.4.6) and (7.4.7) used to calculate the 
parameters of the achromatic plate. The calculated parameters for both the single 
and achromatic plate are found in Table 7.7. The sags at the neutral zone for the 
LLF2 and UBK7 elements are 0.2765 and 0.4268 mm, respectively. 
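
The entries of Table 7.7 for the two glass elements follow from Eqs. (7.4.6) and (7.4.7) once the Abbe numbers are known. The sketch below is our own illustration: it starts from the indices in Table 7.6 and the single-plate SiO₂ parameters in Table 7.7, and scales the curvature and aspheric coefficients by the common factors. Variable names are hypothetical.

```python
def abbe(n_short, n_long):
    """Abbe-type number of Eq. (7.4.2), (<n> - 1)/(n' - n), and the mean index."""
    mean = 0.5 * (n_short + n_long)
    return (mean - 1.0) / (n_short - n_long), mean

V1, mean1 = abbe(1.58789, 1.53081)   # LLF2, element 1 (320 and 880 nm)
V2, mean2 = abbe(1.54634, 1.50935)   # UBK7, element 2 (320 and 880 nm)

n_single = 1.46810                   # SiO2 index used for the single plate
k1 = (n_single - 1.0) / (mean1 - 1.0) * V1 / (V2 - V1)   # Eq. (7.4.6)
k21 = (mean1 - 1.0) / (mean2 - 1.0) * V2 / V1            # Eq. (7.4.7)

# Curvature scales with k, so the radius R_c scales with 1/k.
single = {"R_c": -312067.0, "E": 4.2085e-12, "F": 2.5636e-19}
llf2 = {"R_c": single["R_c"] / k1, "E": single["E"] * k1, "F": single["F"] * k1}
ubk7 = {"R_c": llf2["R_c"] / k21, "E": llf2["E"] * k21, "F": llf2["F"] * k21}
print(round(V1, 3), round(V2, 2))
print(llf2)
print(ubk7)
```

The scaled values agree with the tabulated LLF2 and UBK7 parameters to within rounding of the Abbe numbers.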

Ray traces of a 1-m Schmidt telescope with an achromatic plate specified by 
the parameters in Table 7.7 show a well-corrected system at 320 and 880 nm, with 
the blur diameter on-axis set primarily by residual fifth-order spherical aberration. 
The blur diameters for on-axis images over the range 320–1000 nm are shown by
the solid curve in Fig. 7.8. Although the correction is excellent at the ends of the
range shown, the image diameters in the blue and near ultraviolet are larger than 
desired. 

The corrector as specified provides the proper correction at the chosen
wavelengths, but gives too large a correction over much of the range. This is
easily remedied by making the aspherics on each surface slightly weaker. The
dashed curve in Fig. 7.8 shows the image diameters when the values of E and F
for the elements of the achromatic plate in Table 7.7 are reduced by 0.25%, with
the values of $R_c$ increased by the same amount. The overall improvement in
on-axis image quality over much of the range is evident from a comparison of the
two curves.

Table 7.6

Indices and Abbe Numbers for UBK7 and LLF2ᵃ

         n' (320 nm)   n (880 nm)   n' − n     ⟨n⟩ − 1    V
UBK7     1.54634       1.50935      0.03699    0.52784    14.27
LLF2     1.58789       1.53081      0.05708    0.55935    9.799

ᵃ Indices of refraction taken from Schott catalog.

Table 7.7

Parameters of Single and Achromatic Correctors

         $R_c$       E             F
SiO₂ᵃ    −312067     4.2085E-12    2.5636E-19
LLF2     −170120     7.7201E-12    4.7026E-19
UBK7     −110240     1.1913E-11    7.2567E-19

ᵃ Parameters for SiO₂ plate are similar to those in Table 7.3, but computed with n = 1.46810.

Fig. 7.8. Image diameters for f/2.5 achromatic Schmidt camera. Solid curve: parameters in Table
7.7; dashed curve: parameters adjusted as noted in text, Section 7.4.

The quality of the off-axis images is acceptable for the modified corrector, 
provided it is moved about 30 mm away from the mirror. This shift reduces the
coma to near negligible levels and ray traces give symmetrical images of 
acceptable size over a field diameter of 6°. Spot patterns are shown in Fig. 7.9 
at five wavelengths and three field angles, in addition to the images on axis. 

As in the design of any Schmidt system, computer optimization is used to 
balance the various aberrations and find the best overall set of parameters. For a 
discussion of this process and the results found for an f/3.5 achromatic Schmidt,
the reader should consult the reference by Buchroeder (1972). Results for a 
Schmidt camera in an echelle spectrometer are given in the reference by 
Schroeder (1987). 

In summary, the Schmidt telescope with an achromatic corrector has the 
advantage of an extended wavelength range over which good images are 
obtained. With the availability of several glasses that transmit well into the 
ultraviolet, the choice of an achromatic corrector over a standard one is a viable 
option. 

Fig. 7.9. Spot diagrams for 1-m f/2.5 achromatic Schmidt camera at selected wavelengths and
field angles. Scale bar at the upper left is 2 arc-sec long. See Section 7.4 for the parameter values.


7.5. SOLID- AND SEMISOLID-SCHMIDT CAMERAS 


A common use of the standard Schmidt camera is as the camera in a 
spectrograph. In this application different wavelengths are in focus at different 
places on the focal surface, and it is no longer necessary that the camera be 
strictly achromatic. It is therefore possible to modify the standard air-Schmidt to 
achieve improvements that are otherwise not possible. 

One such modified Schmidt is the so-called solid-Schmidt, one in which the 
space between the corrector and mirror is filled with glass, as shown in Fig. 7.10. 
In this design the aspheric surface is on one end of the glass block and the mirror 
is on the other end. If the length of the block is equal to the radius of curvature of 
the mirror, then third-order off-axis aberrations are zero, just as for an air- 
Schmidt. The condition for zero spherical aberration and minimum chromatic 
aberration is given by Eq. (7.2.14), where $n_2 = n$, the index of the glass block.
Compared to an air-Schmidt, the aspheric figure is n times stronger and the radius
$R_c$ is n times smaller.

From Fig. 7.10 we see that a chief ray entering the block at angle θ makes angle
θ/n with the z-axis inside the block. Because this ray is reflected back on itself,
the height h of the corresponding image is fθ/n from the z-axis. Thus the
effective focal length f' of the solid-Schmidt is f/n, where f is the focal length of
the equivalent air-Schmidt.

Fig. 7.10. Solid-Schmidt camera of index n and effective focal length f/n. The aspheric figure
and mirror are on opposite ends of the block.

The reduction in focal length by a factor of n is significant for several reasons. 
First, the focal ratio is reduced by this factor and thus the “speed” of the camera 
is effectively larger by a factor of n². The term “speed” for a spectrograph is
defined in Chapter 12; at this point it is sufficient to note that exposure time to a
given level is inversely proportional to the speed. Second, the off-axis aberrations
present in an optimized air-Schmidt camera are smaller by a factor of n² in a
solid-Schmidt. As a consequence, a solid camera will have comparable image 
quality at a field angle that is n times larger than that of an air-Schmidt of the 
same size. Alternatively, a solid-Schmidt will cover the same field as that of an 
air-Schmidt, where the former is n times shorter. 

Given height h = fθ/n, we find the variation of h with changing index is given
by

$$dh = -\theta f\,dn/n^2. \tag{7.5.1}$$

If, for example, we take the values of n for SiO₂ from Table 7.5 at 400 and
700 nm, and assume f = 500 mm, then dh = 61 µm for a field angle of 1°. A
lateral shift of this amount is not acceptable in direct imagery because a point
source would be imaged as a short spectrum, with its length proportional to the
field angle. In a spectrograph camera, on the other hand, each image of the slit is
quasi-monochromatic and the lateral shift is simply an offset without additional
blurring.
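
The quoted shift can be reproduced in a few lines. The sketch below evaluates the magnitude of Eq. (7.5.1) with the SiO₂ indices at 400 and 700 nm from Table 7.5; using the mean of the two indices for n is our assumption, and the function name is illustrative.

```python
import math

def lateral_color_shift_mm(f, theta_deg, n_blue, n_red):
    """Magnitude of dh from Eq. (7.5.1), theta*f*dn/n^2, with n taken as the
    mean of the two indices (an assumption of this sketch)."""
    n_mean = 0.5 * (n_blue + n_red)
    dn = n_blue - n_red
    return math.radians(theta_deg) * f * dn / n_mean**2

dh = lateral_color_shift_mm(500.0, 1.0, 1.47012, 1.45523)
print(f"dh = {dh * 1000:.0f} um")
```

Because the shift is linear in the field angle, doubling the field angle doubles the length of the short spectrum that a point source would produce in direct imagery.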

The effect of index n on the aberrations is most easily seen from an example.
Ignore for the moment the aspheric term on the surface of the solid-Schmidt and
consider only the radius term. Because the stop is at the surface, the astigmatism
coefficient is that given in Table 5.1. For s = ∞ we find

$$A_1 = -\frac{\theta^2}{2}\left(\frac{n-1}{n^2 R_c}\right). \tag{7.5.2}$$

The corrector for the air-Schmidt, in the absence of the aspheric term, is a
plano-convex lens of thickness d. The astigmatism coefficient of the lens is the sum of
the surface coefficients.

With the convex surface facing the incident light, and the stop at this surface,
we find that the astigmatism coefficient of the lens for $d \ll R_c$ is given by

$$B_1 = -\frac{\theta^2}{2}\left(\frac{n-1}{R_c}\right). \tag{7.5.3}$$

In comparing Eqs. (7.5.2) and (7.5.3) it is important to note that Eq. (7.5.2)
applies to the solid-Schmidt and $R_c$ is n times smaller than in Eq. (7.5.3).
Therefore $A_1$ for the solid-Schmidt is n times smaller than $B_1$ for the air-Schmidt.
Substituting each of these coefficients into Eq. (5.6.6) we see that the transverse
aberration for the solid-Schmidt is smaller by another factor of n. Hence the
astigmatism due to the radius term on the corrector is smaller by a net
factor of n², as already stated here. The same factor is found in a comparison of
the coma coefficients.
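
The first factor of n in this comparison can be verified numerically from Eqs. (7.5.2) and (7.5.3). In the sketch below the index, field angle, and radius are illustrative values only (the radius magnitude is borrowed from Table 7.3), and the function names are ours. The second factor of n, which comes from Eq. (5.6.6), is not evaluated here.

```python
def astig_solid_surface(theta, n, R_c):
    """A1 of Eq. (7.5.2): refracting surface of radius R_c, stop at surface."""
    return -0.5 * theta**2 * (n - 1.0) / (n**2 * R_c)

def astig_air_lens(theta, n, R_c):
    """B1 of Eq. (7.5.3): thin plano-convex lens, stop at the convex face."""
    return -0.5 * theta**2 * (n - 1.0) / R_c

n, theta = 1.46, 0.0175            # index and field angle (rad), illustrative
R_c_air = 306667.0                 # radius magnitude from Table 7.3 (mm)
R_c_solid = R_c_air / n            # radius term is n times smaller in the block
ratio = astig_solid_surface(theta, n, R_c_solid) / astig_air_lens(theta, n, R_c_air)
print(ratio, 1.0 / n)
```

The ratio of the two coefficients is 1/n regardless of the field angle or radius chosen, since both cancel in the quotient.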

The fabrication of the solid-Schmidt is obviously difficult because the curved 
focal surface lies in the center of the block. To avoid the complication of 
preparing this surface in a hole in the block, an alternative is the so-called 
semisolid- or thick-mirror Schmidt. This camera is one in which glass fills the 
space between the focal surface and the mirror, with a conventional aspheric plate 
in front of the block, as shown in Fig. 7.11. Except for the curved focal surface, 
the face of the block toward the corrector is plane. 

From Fig. 7.11 we see that the location of the corrector is such that the chief
ray, after refraction at the surface of the block, appears to come from the center of
curvature of the mirror. Because the refracted chief ray makes angle θ/n with the
z-axis, the distance from the axis to the image point is the same as that of the
solid-Schmidt. Hence the focal length of the thick-mirror Schmidt is the same as
that of the solid-Schmidt and all of the preceding comments also apply. The
aspheric figure and radius $R_c$ are also the same as those for the solid-Schmidt.

Fig. 7.11. Semisolid-Schmidt camera with center of curvature at C. Focal length = f/n.

Ray traces of a solid-Schmidt and thick-mirror Schmidt, with a b' aspheric
term added to control fifth-order spherical aberration, show very similar image
characteristics. For F = 2.5, the focal ratio of the equivalent air-Schmidt, the
image blur diameters are 1 arc-sec at a field angle of 5°. Compared with the f/2.5
design example in Section 7.3, the field is about n times larger, as expected.


Chapter 8 Catadioptric Telescopes and Cameras


In this chapter we discuss various derivatives of the Schmidt type of telescope, 
including Schmidt-Cassegrain, Baker-Schmidt, and Bouwers-Maksutov systems. 
Each of these is a type of catadioptric telescope in which a full-aperture 
refracting element provides the aberration correction needed to get good imagery 
over a wide field. Given this definition, the classical Schmidt telescope is also of 
this type. 

The Schmidt-Cassegrain, as the name suggests, is a two-mirror system with an 
aspheric corrector in the collimated beam ahead of the primary mirror. Baker- 
Schmidt systems are a subclass of the Schmidt-Cassegrain with a flat focal 
surface, of which examples of two specific types are given. The Bouwers- 
Maksutov type is one in which the aspheric corrector is replaced by a meniscus 
lens with spherical surfaces. This type of corrector, in combination with one or 
two mirrors, is the basis for a wide variety of wide-field systems. The design 
parameters are given for selected examples of systems using a meniscus corrector. 


8.1. SCHMIDT-CASSEGRAIN TELESCOPES 


The Schmidt-Cassegrain telescope, hereafter designated SC, is a two-mirror
telescope with a corrector plate in the collimated beam, as shown in Fig. 8.1.
Compared to an all-reflective Cassegrain, the principal differences are the
addition of an aspheric plate to compensate for the spherical aberration of the
mirrors and the shift of the aperture stop from the primary to the corrector. With
these changes there are additional free parameters available for the elimination of
other aberrations, and a host of wide-field SC systems are possible.

Fig. 8.1. Schematic of Schmidt-Cassegrain telescope with stop at corrector plate. Distance from
stop to primary = σf_1.

In this section we outline the procedure by which the aberration characteristics 
of a general SC are found. Rather than exploring the features of the general SC, 
however, we choose to apply these results to a selected number of SC types to 
illustrate their basic features. The types considered include the flat-field anastigmat,
the SC with spherical mirrors, and the “short” SC with the corrector
approximately a distance f_1 from the primary. For further details on these and
other types of SC systems, the reader should consult the work by Linfoot (1955). 
A thorough discussion of a variety of Schmidt-Cassegrain systems is also given 
by Wilson (1996). 


8.1.a. GENERAL PARAMETERS 


The notation used in writing the aberration coefficients for each surface is the 
same as that used for two-mirror telescopes. The subscripts 1 and 2 refer to the 
primary and secondary mirrors, respectively, while the subscript c is used for the 
corrector. For a concave primary, the only type considered, the focal length f_1 is
positive.

The relative locations of the mirrors and focal surface of the SC are described
in terms of the normalized parameters in Table 6.3 used for two-mirror
telescopes. An additional normalized parameter introduced for the SC is σ, the
location of the aspheric plate relative to the primary in units of the primary focal
length. According to our sign convention, the distance W_1 from the primary to the
stop is negative and we therefore choose to define σ = -W_1/f_1 to make σ
positive.
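Table 8.1

Aberration Coefficients for SC Primary

\[ B_{11} = \frac{\theta^2}{R_1}\left[1 - \sigma + \frac{\sigma^2}{4}(K_1 + 1)\right] \]

\[ B_{21} = \frac{\theta}{R_1^2}\left[1 - \frac{\sigma}{2}(K_1 + 1)\right] \]

\[ B_{31} = \frac{K_1 + 1}{4R_1^3} \]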


With the stop at the corrector, hence W_c = 0, the only nonzero aberration
coefficient for the corrector is B_3c = -b/8. In writing this result we ignore the
radius term added to minimize chromatic aberration.

For the primary mirror, the stop is at a distance W_1 = -σf_1, the chief ray angle
ψ_1 is the field angle θ, and the magnification m is zero. Substituting these results
into the equations in Table 5.6 gives the coefficients for the primary, in the form
shown in Table 8.1. Note that n = 1 for the primary.

To find the aberration coefficients for the secondary, we first determine the 
location of the pupil for the secondary using the paraxial relations. As shown in 
Fig. 8.2, the primary images the stop at a distance W_1′ = f_1σ/(1 - σ), where W_1′ is
negative when σ > 1. The location of the pupil relative to the secondary is
W_2 = W_1′ + T, where T = (1 - k)f_1 is the separation between the primary and
secondary.

To find the chief ray angle ψ_2 for the secondary, we see from Fig. 8.2 that the
chief ray is directed toward the center of the pupil after reflection from the primary.





Fig. 8.2. Geometry of aperture stop AS, pupil P, and chief ray angles for Cassegrain telescope. 
See Eq. (8.1.1) and preceding discussion. 
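Table 8.2

Aberration Coefficients for SC Secondary

\[ B_{12} = -\frac{\theta^2(\sigma - 1)^2}{\rho R_1}\left[K_2\left(\frac{W_2}{\rho R_1}\right)^2 + \left(1 - \frac{W_2}{\rho R_1}\right)^2\right] \]

\[ B_{22} = \frac{\theta(\sigma - 1)}{(\rho R_1)^2}\left[K_2\,\frac{W_2}{\rho R_1} + \left(\frac{m+1}{m-1}\right)\left(1 - \frac{W_2}{\rho R_1}\right)\right] \]

\[ B_{32} = -\frac{1}{4(\rho R_1)^3}\left[K_2 + \left(\frac{m+1}{m-1}\right)^2\right] \]

\[ \frac{W_2}{\rho R_1} = \frac{1 + k(\sigma - 1)}{2\rho(\sigma - 1)} \]
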
Therefore W_1ψ_1 = W_1′ψ_2, where ψ_2 is the chief ray angle for the secondary.
Substituting for ψ_1 and W_1′ gives

\[ \psi_2 = (\sigma - 1)\theta, \qquad W_2 = -f_1\,\frac{1 + k(\sigma - 1)}{\sigma - 1}. \tag{8.1.1} \]

The resulting aberration coefficients for the secondary, taken from Table 5.5, are
shown in Table 8.2. Note that n = -1 for the secondary and that R_2 has been
replaced by ρR_1 in writing these relations.

The system aberration coefficients are found by applying Eq. (5.6.11) to 
corresponding sets of surface coefficients. After substitution of Eqs. (8.1.1) into 
the coefficients in Table 8.2, and following some straightforward algebra, we get 
the system coefficients for a general SC given in Table 8.3. Also given in Table 
8.3 is the curvature of the median astigmatic surface. The derivation of this 
curvature relation follows from the discussion preceding Eq. (6.2.2). 

Before proceeding to apply these aberration results to selected examples, we
develop some additional useful relations between the normalized parameters.
From Section 2.5 we find η = βF_1 = βF/m, where η is the back focal distance in
units of the telescope diameter D. Using the relations in Table 6.3 we write k in
terms of ρ, η, and F, with the result

\[ k^2F - kF(2\rho + 1) + \rho(F + \eta) = 0. \]

Solving this relation for k we get

\[ k = \rho + \frac{1}{2} - \left(\rho^2 + \frac{1}{4} - \frac{\rho\eta}{F}\right)^{1/2}, \tag{8.1.2} \]

where the minus sign in front of the radical is chosen to ensure that k < 1. From
Eq. (8.1.2) we see that a specification of ρ, η, and F sets the value of k, which in
turn fixes the values of m and F_1. Note that k is independent of F when η = 0.
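Table 8.3

Aberration Coefficients for General SC

\[ B_{1s} = \frac{\theta^2}{4R_1}\left(\sigma^2\Omega_3 - 2\sigma\Omega_2 + \Omega_1\right) \]

\[ B_{2s} = \frac{\theta}{2R_1^2}\left(\Omega_2 - \sigma\Omega_3\right) \]

\[ B_{3s} = \frac{\Omega_3}{4R_1^3} - \frac{b}{8} \]

\[ \Omega_1 = 4 - \frac{k^2}{\rho^3}\left[(2\rho + 1 - k)^2 + (1 - k)^2K_2\right] \]

\[ \Omega_2 = 2 - \frac{k^2}{\rho^3}\left[(2\rho - k)(2\rho + 1 - k) - k(1 - k)K_2\right] \]

\[ \Omega_3 = 1 + K_1 - \frac{k^2}{\rho^3}\left[(2\rho - k)^2 + k^2K_2\right] \]

\[ \kappa_m = \frac{2}{R_1}\left(\frac{1 - \rho}{\rho}\right) + \frac{1}{R_1}\left(\sigma^2\Omega_3 - 2\sigma\Omega_2 + \Omega_1\right) \]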
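
The coefficients in Table 8.3 lend themselves to direct numerical evaluation. The sketch below assumes the forms Ω_1 = 4 - (k²/ρ³)[(2ρ + 1 - k)² + (1 - k)²K_2], Ω_2 = 2 - (k²/ρ³)[(2ρ - k)(2ρ + 1 - k) - k(1 - k)K_2], and Ω_3 = 1 + K_1 - (k²/ρ³)[(2ρ - k)² + k²K_2], with B_1s = (θ²/4R_1)(σ²Ω_3 - 2σΩ_2 + Ω_1), B_2s = (θ/2R_1²)(Ω_2 - σΩ_3), and B_3s = Ω_3/4R_1³ - b/8. The function names are hypothetical.

```python
def omegas(k, rho, K1=0.0, K2=0.0):
    """Return (Omega1, Omega2, Omega3) for a general SC (forms assumed from Table 8.3)."""
    c = k**2 / rho**3
    omega1 = 4.0 - c * ((2 * rho + 1 - k) ** 2 + (1 - k) ** 2 * K2)
    omega2 = 2.0 - c * ((2 * rho - k) * (2 * rho + 1 - k) - k * (1 - k) * K2)
    omega3 = 1.0 + K1 - c * ((2 * rho - k) ** 2 + k**2 * K2)
    return omega1, omega2, omega3

def system_coefficients(theta, R1, sigma, b, om):
    """Return (B1s, B2s, B3s): astigmatism, coma, spherical aberration coefficients."""
    om1, om2, om3 = om
    B1s = theta**2 / (4 * R1) * (sigma**2 * om3 - 2 * sigma * om2 + om1)
    B2s = theta / (2 * R1**2) * (om2 - sigma * om3)
    B3s = om3 / (4 * R1**3) - b / 8
    return B1s, B2s, B3s

# Sanity check: with k = 0 (no secondary), a spherical primary, sigma = 2 and
# b = 2/R1^3, the system reduces to a classical Schmidt and all coefficients vanish.
R1 = -2.0
print(system_coefficients(0.01, R1, 2.0, 2 / R1**3, omegas(0.0, 1.0)))
```

The limiting case in the last lines is a consistency check only; it does not replace the derivation from the surface coefficients.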


It is also important to determine the sizes of the primary and secondary
required to cover a given field without vignetting. If D is the diameter of the
aperture stop, then from the geometry in Fig. 8.3 we find that D_1, the diameter of
the primary, is given by

\[ D_1 = D(1 + 2\theta\sigma F_1), \tag{8.1.3} \]

where θ is the angular radius of the field.





Fig. 8.3. Geometry showing primary mirror diameter needed to cover field of angular radius θ
without vignetting. See Eq. (8.1.3).
Fig. 8.4. Geometry of secondary mirror diameter needed to cover field of radius θ without
vignetting at outer edge. See Eq. (8.1.4).


To determine the size of the secondary, we use the geometry in Fig. 8.4. The
diameter D_2, expressed in terms of the diameter of the corrector, is given by

\[ D_2 = D\left[k + 2\theta F_1\bigl(k(\sigma - 1) + 1\bigr)\right]. \tag{8.1.4} \]

Note that Eqs. (8.1.3) and (8.1.4) with σ = 0 apply to the two-mirror telescopes
discussed in Chapter 6. Note also that D_1 and D_2 can be expressed in terms of η
and F with the substitution F_1 = F(ρ - k)/ρ. We now apply the results in this
section to some specific types of SC telescopes.


8.1.b. FLAT-FIELD ANASTIGMATIC SCHMIDT-CASSEGRAIN 


An anastigmatic optical system, as noted in Section 5.8.c, is one with zero 
astigmatism, coma, and spherical aberration. The condition for a flat field for an 
anastigmat is zero Petzval curvature; the surface of best images is a plane. From 
Eq. (5.7.17) it follows that Petzval curvature is zero when ρ = 1. Setting ρ = 1
also fixes m in terms of k. From Table 6.3 we get m = 1/(1 - k).

Applying the anastigmatic condition to the aberration coefficients in Table 8.3
gives

\[ \Omega_2 = \sigma\Omega_3, \qquad \Omega_1 = \sigma\Omega_2, \qquad bR_1^3 = 2\Omega_3. \tag{8.1.5} \]

The first step in the procedure for solving the relations in Eqs. (8.1.5) to find the
system parameters is to calculate k from Eq. (8.1.2) for selected η and F.
Substituting this value of k into Ω_1 = σΩ_2 gives a relation between σ and K_2, and
one can be found after the other is specified. With σ and K_2 now known, K_1 is
computed using Ω_2 = σΩ_3. Finally, using the known values of the conic
constants, b is calculated using the last relation in Eq. (8.1.5). Carrying out the
first step for F = 3 and selected values of η, we get the results shown in Table 8.4.
Table 8.4

Parameters of f/3 Flat-Field Anastigmatic Schmidt-Cassegrain

   η       k       m      F_1

 -0.10   0.3672   1.580   1.899
 -0.05   0.3745   1.599   1.876
  0.00   0.3820   1.618   1.854
  0.05   0.3894   1.638   1.832
  0.10   0.3970   1.658   1.809


Note that η < 0 places the focal plane between the primary and secondary
mirrors.

Using the values of k from Table 8.4 we now find the characteristics of two
specific systems, first analyzed by Baker (1940). The first is the so-called Baker A
design with σ = 1; the second is the Baker B design for which K_2 = 0. Baker
also gave the results for two other flat-field systems; the Baker C design has
K_1 = 0 and the Baker D design is free of distortion. The parameters of the C
design are little different from those of the B design, while those of the D design
lie between those of the Baker A and B systems. For specifics on these other
versions, see the reference by Linfoot (1955).

For the Baker A design, the solution of the relations in Eqs. (8.1.5) gives

\[ K_1 = 1 + 2k, \qquad K_2 = -1 + \frac{2(1 + k)}{k^2}, \tag{8.1.6} \]

and for the Baker B

\[ K_1 = \frac{k^2(1 - k)^2}{4 - k^2(3 - k)^2}, \qquad \sigma = \frac{4 - k^2(3 - k)^2}{2 - k^2(2 - k)(3 - k)}. \tag{8.1.7} \]

Table 8.5 gives the parameters of the Baker A design, including the relative diameter of the
secondary needed to cover a field 0.1 radians in diameter. Table 8.6 gives the
results for the Baker B design.
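
The tabulated Baker A and Baker B parameters can be regenerated with a few lines of Python. This is a sketch under stated assumptions: it takes ρ = 1, uses K_1 = 1 + 2k and K_2 = -1 + 2(1 + k)/k² with σ = 1 for the A design, and K_1 = k²(1 - k)²/[4 - k²(3 - k)²], K_2 = 0, σ = [4 - k²(3 - k)²]/[2 - k²(2 - k)(3 - k)] for the B design. The Ω_3 expression and the D_2/D relation are taken in the forms Ω_3 = 1 + K_1 - k²[(2 - k)² + k²K_2] and D_2/D = k + 2θF_1[k(σ - 1) + 1]. All function names are hypothetical.

```python
import math

def baker_a(k):
    """Baker A (sigma = 1, rho = 1): return (K1, K2, sigma)."""
    return 1 + 2 * k, -1 + 2 * (1 + k) / k**2, 1.0

def baker_b(k):
    """Baker B (K2 = 0, rho = 1): return (K1, K2, sigma)."""
    den = 4 - k**2 * (3 - k) ** 2
    return k**2 * (1 - k) ** 2 / den, 0.0, den / (2 - k**2 * (2 - k) * (3 - k))

def table_row(k, K1, K2, sigma, F=3.0, theta=0.05):
    """Return (Omega3, D2/D, m^3 * Omega3) for a flat-field (rho = 1) design."""
    m = 1 / (1 - k)
    omega3 = 1 + K1 - k**2 * ((2 - k) ** 2 + k**2 * K2)
    d2 = k + 2 * theta * (F / m) * (k * (sigma - 1) + 1)
    return omega3, d2, m**3 * omega3

k = 1.5 - math.sqrt(1.25)  # Eq. (8.1.2) with rho = 1 and eta = 0
for name, design in (("A", baker_a), ("B", baker_b)):
    K1, K2, sigma = design(k)
    omega3, d2, ratio = table_row(k, K1, K2, sigma)
    print(f"Baker {name}: K1={K1:.4f} K2={K2:.3f} sigma={sigma:.4f} "
          f"Omega3={omega3:.4f} D2/D={d2:.3f} m^3*Omega3={ratio:.2f}")
```

The η = 0 case is shown; other rows follow by computing k from Eq. (8.1.2) for the desired η.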

From the tabulated data in Tables 8.5 and 8.6 it is evident that there is a 
significant difference between the two designs. The mirrors in the Baker A design 
are strongly elliptical, while the primary in the other system differs only slightly 
from a sphere. The diameter of the secondary needed to cover the given field is 
about 15% smaller for the A version, hence there is about 30% less vignetting in 
this design. This difference is a direct consequence of the difference in lengths 
between the two designs, about a factor of two. 

Table 8.5
Parameters of f/3 Baker A Design^a

   η      K_1      K_2      Ω_3     D_2/D   m³Ω_3

 -0.10   1.7343   19.284   2.0245   0.557   7.99
 -0.05   1.7491   18.597   2.0125   0.562   8.23
  0.00   1.7639   17.944   2.0000   0.567   8.47
  0.05   1.7789   17.322   1.9870   0.573   8.73
  0.10   1.7940   16.729   1.9735   0.578   9.00

^a Field radius θ = 0.05 radians.


Table 8.6
Parameters of f/3 Baker B Design^a

   η      K_1       σ        Ω_3     D_2/D   m³Ω_3

 -0.10   0.01761   2.1581   0.6582   0.638   2.60
 -0.05   0.01809   2.1644   0.6475   0.644   2.65
  0.00   0.01858   2.1708   0.6366   0.650   2.70
  0.05   0.01906   2.1775   0.6256   0.657   2.75
  0.10   0.01954   2.1843   0.6146   0.663   2.80

^a Field radius θ = 0.05 radians.


The most significant difference between the two designs is in the size of Ω_3,
which is approximately 3 times larger for the A version. As seen from Eqs.
(8.1.5), this means that the aspheric term b is larger by this factor, as are the
chromatic effects. This is most easily seen by noting that CSA in Eq. (7.2.6) is
proportional to b, the aspheric parameter. By definition, CSA is proportional to
the slope of the surface at the edge of a corrector configured for minimum
chromatic aberration. Putting Eq. (7.2.4) into Eq. (7.2.1), setting r = r_0, and
ignoring the term in b′, we find

\[ \frac{dz}{dr} = \frac{b\,r_0^3}{8(n' - n)}. \tag{8.1.8} \]

Hence larger b means larger CSA in direct proportion, and the chromatic
aberration of the Baker A design is about 3 times that of the B design.

It is also important to compare the chromatic properties of each SC system 
with that of a standard Schmidt of the same final focal ratio. The ratio of the 
chromatic aberrations is simply the ratio of the corresponding b terms, hence 


\[ \frac{CSA(SC)}{CSA(SS)} = \frac{2\Omega_3}{R_1^3}\,\frac{R^3}{2} = m^3\Omega_3, \tag{8.1.9} \]

where SS denotes a standard Schmidt, whose mirror radius is R = mR_1 for the same final
focal length, and m is the magnification of the Schmidt-
Cassegrain. Values of the relative chromatic aberration from Eq. (8.1.9) are found
in the rightmost columns in Tables 8.5 and 8.6. It is evident from these results that
a single element corrector in a Schmidt-Cassegrain as fast as f/3 has chromatic
aberration that is significantly larger than that of a standard Schmidt, especially
for the Baker A version.

Ray traces of the Baker B system, with the addition of a b′ aspheric parameter
to control fifth-order spherical aberration, show acceptable images to a field
radius of about 3° at the design wavelength. An acceptable image is defined as
one for which the blur diameter is no larger than 15 µm for an overall focal length
of 2700 mm. This diameter corresponds to an angular blur of about 1 arcsec for
this focal length.
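
The conversion from linear blur diameter to angular blur is a one-line small-angle calculation. The short sketch below illustrates it; the helper name `blur_arcsec` is illustrative and not from the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def blur_arcsec(diameter_um, focal_length_mm):
    """Angular size in arc-seconds of a blur of given diameter (small-angle approximation)."""
    return (diameter_um * 1e-3 / focal_length_mm) * ARCSEC_PER_RAD

print(f"{blur_arcsec(15.0, 2700.0):.2f} arcsec")
```

A 15 µm blur at a 2700 mm focal length comes out slightly above 1 arcsec, consistent with the acceptance criterion used here.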

Ray traces of the Baker A design give acceptable on-axis images, as defined 
here, only with the addition of aspheric parameters of still higher order to the 
corrector plate profile. This is not surprising given that the spherical aberration of 
the pair of highly elliptical mirrors is significantly larger than for the Baker B 
mirror pair, especially SA5 and SA7. The field radius for acceptable images is 
less than one-half that of the Baker B design. Thus the A version, in spite of its 
shorter length and smaller vignetting by the secondary compared to the B version, 
is probably not a viable option as a wide-field telescope. 

An analysis of the general solution of the relations in Eqs. (8.1.5) shows that
the product σΩ_3 decreases slowly as K_2 decreases. As can be verified from Ω_2 in
Table 8.3 together with the first of Eqs. (8.1.5), a change in K_2 from 10 to zero
gives a decrease in σΩ_3 of roughly 30%. For this same decrease in K_2, the factor
Ω_3 decreases by a bit over a factor of 2 while σ increases by about a factor of 1.6.
Hence there is a tradeoff between chromatic aberration and vignetting of the
secondary, with a reasonable balance achieved when the conic constants of the
mirrors are near zero.

Compared to a standard Schmidt, the Baker B design has the advantages of a
flat focal surface and a shorter length by about 40%. If these advantages more
than outweigh the disadvantages of larger chromatic aberration and vignetting by
the secondary of 40% or more, then this system is a viable alternative to the
Schmidt. To be competitive with a standard Schmidt over a wide spectral range
would, however, require an achromatic corrector of the type described in Section
7.4.


8.1.c. SCHMIDT-CASSEGRAIN WITH SPHERICAL MIRRORS 


An alternative to the flat-field anastigmatic SC is the family in which both 
mirrors are spherical and the focal surface is curved. The analysis of this type of 
SC proceeds in a way very similar to that in the last section. Putting K_1 = K_2 = 0
in the relations in Table 8.3, the system aberration factors take the following
simplified form

\[ \begin{aligned}
\rho^3\Omega_1 &= 4\rho^3 - k^2(2\rho + 1 - k)^2, \\
\rho^3\Omega_2 &= 2\rho^3 - k^2(2\rho - k)(2\rho + 1 - k), \\
\rho^3\Omega_3 &= \rho^3 - k^2(2\rho - k)^2.
\end{aligned} \tag{8.1.10} \]

If we require that the system be anastigmatic, then the relations in Eq. (8.1.5)
apply. The only solution from these relations is 2ρ = 1 + k. Substituting for ρ in
Eqs. (8.1.10) gives Ω_1 = 2Ω_2 = 4Ω_3, hence σ = 2 by Eqs. (8.1.5). Writing ρ in
terms of R_1 and R_2, we find the relation R_2 - R_1 = (1 - k)f_1. Hence the two
mirrors have a common center of curvature with the vertex of the corrector at this
common point. This is the so-called concentric Schmidt-Cassegrain. Because the
mirrors are concentric, so also is the Petzval surface, the focal surface when the
astigmatism is zero.

Substituting 2ρ = 1 + k into Eq. (8.1.2) we find

\[ k = \frac{1 + \eta/F}{3 - \eta/F}. \tag{8.1.11} \]

The magnification of the secondary is m = (1 + k)/(1 - k) and the Petzval
curvature from Eq. (5.7.17) is

\[ \kappa_p = \frac{2}{R_1}\left(\frac{1 - k}{1 + k}\right) = \frac{1}{R_1}\left(1 - \frac{\eta}{F}\right). \tag{8.1.12} \]

Parameters for several concentric SCs are given in Table 8.7. Comparison with
the parameters for the Baker B systems in Table 8.6 shows that the chromatic
spherical aberration of the concentric SC is approximately 2 times larger. Thus
the simplification of having only spherical mirrors is offset by larger chromatic
aberration and a curved focal surface.
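
The concentric design is fully determined by η and F, which makes it easy to tabulate. The following sketch assumes k = (1 + η/F)/(3 - η/F), the concentric condition 2ρ = 1 + k, m = (1 + k)/(1 - k), and Ω_3 = 1 - k²(2ρ - k)²/ρ³ for spherical mirrors; the function name is hypothetical.

```python
def concentric_sc(eta, F):
    """Return (k, rho, m, Omega3, m^3 * Omega3) for a concentric SC with spherical mirrors."""
    k = (1 + eta / F) / (3 - eta / F)       # from 2*rho = 1 + k in the quadratic for k
    rho = (1 + k) / 2                        # concentric condition
    m = (1 + k) / (1 - k)                    # secondary magnification
    omega3 = 1 - k**2 * (2 * rho - k) ** 2 / rho**3   # K1 = K2 = 0
    return k, rho, m, omega3, m**3 * omega3

for eta in (-0.10, 0.0, 0.10):
    k, rho, m, omega3, ratio = concentric_sc(eta, 3.0)
    print(f"{eta:+.2f}  {k:.4f}  {rho:.4f}  {m:.3f}  {omega3:.4f}  {ratio:.2f}")
```

For η = 0 this gives k = 1/3, ρ = 2/3, m = 2, and Ω_3 = 0.625, so the chromatic ratio m³Ω_3 is 5.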

Another system with only spherical mirrors is the aplanatic spherical Schmidt- 
Cassegrain, a system in which spherical aberration and coma are zero but 
astigmatism is not. In this case only the first and last of the relations in Eqs. 
(8.1.5) hold. With astigmatism not equal to zero, we choose to set κ_m, the
curvature of the median image surface, to zero. Combining the relations for B_1s
and κ_m in Table 8.3, and setting κ_m = 0, gives

\[ B_{1s} = -\frac{\theta^2}{2R_1}\left(\frac{1 - \rho}{\rho}\right), \tag{8.1.13} \]

hence

\[ AAS = 2B_{1s}y_1 = \frac{m\theta^2}{4F}\left(\frac{1 - \rho}{\rho}\right). \tag{8.1.14} \]
Table 8.7
Parameters of f/3 Schmidt-Cassegrains with Spherical Mirrors

Concentric Design (σ = 2)

   η       k        ρ        m      Ω_3     m³Ω_3

 -0.10   0.3187   0.6593   1.936   0.6457   4.69
  0.00   0.3333   0.6667   2.000   0.6250   5.00
  0.10   0.3483   0.6742   2.069   0.6041   5.35

Aplanatic Design (ρ = 0.95)

   η       k        σ        m      Ω_3     m³Ω_3

 -0.10   0.3618   2.198    1.615   0.6388   2.69
  0.00   0.3765   2.214    1.657   0.6163   2.80
  0.10   0.3913   2.231    1.700   0.5943   2.92



It is evident from Eq. (8.1.14) that it is necessary to have ρ near one to keep the
astigmatism small. If, for example, we choose ρ = 0.95, m = 1.7, and F = 3,
then AAS is approximately 1 arc-sec when θ = 1.5°.
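
The quoted astigmatism can be checked numerically. The sketch below assumes the angular astigmatism of the flat-median-field aplanat has the form AAS = (mθ²/4F)(1 - ρ)/ρ, with θ in radians; the function name is illustrative.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def aplanat_astigmatism_arcsec(theta_deg, rho, m, F):
    """Angular astigmatism (arc-sec) of the spherical-mirror aplanat with kappa_m = 0."""
    theta = math.radians(theta_deg)
    return m * theta**2 / (4 * F) * (1 - rho) / rho * ARCSEC_PER_RAD

print(f"{aplanat_astigmatism_arcsec(1.5, 0.95, 1.7, 3.0):.2f} arcsec")
```

Because AAS grows as θ², halving the field radius reduces the astigmatism by a factor of four.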

Table 8.7 gives the calculated parameters for several aplanatic SCs with 
spherical mirrors. Note that chromatic aberration, the main discriminant between 
the two spherical mirror designs, is significantly smaller for the aplanat. We also 
see that CSA for the aplanat and the Baker B design are comparable, a result that 
is not surprising given mirrors that are similar. 

Ray traces of the aplanat in Table 8.7, with the addition of a b′ parameter to
control fifth-order spherical aberration, show that astigmatism limits the field 
diameter to about 3°, as compared with roughly twice this value for the Baker B 
design. Thus the Baker B system, with its substantially larger field, has a clear 
edge over the aplanat. 


8.1.d. COMPACT SCHMIDT-CASSEGRAIN WITH SPHERICAL PRIMARY


We have limited our discussion of the Schmidt-Cassegrain in the preceding 
sections to those designs that are possible alternatives to the standard Schmidt, 
that is, designs with wide field and relatively fast focal ratios. If these conditions 
are changed to smaller field, on the order of 1° in diameter, and Cassegrain focal
ratios ~10, then a family of aplanatic SC designs is found with σ ~ 1 and
tolerable astigmatism. Although various combinations of K_1 and K_2 are possible,
the usual choice is a spherical primary. With this choice the secondary is
ellipsoidal and m ~ 5. Small telescopes of this type are available from several
manufacturers and are popular choices among amateur astronomers. 
The characteristics of “short” aplanatic SC telescopes of this type are found
by applying the general theory given in the preceding. In this section we outline
the approach and give results for a typical set of parameters. We consider only the
case where the primary is spherical, hence K_1 = 0.

The aplanatic condition is zero coma and spherical aberration, hence B_2s and
B_3s in Table 8.3 are set to zero. From the zero coma condition we find

\[ K_2\left[1 + k(\sigma - 1)\right] = (\sigma - 2)\left(\frac{m}{m-1}\right)^3 + \left(\frac{m+1}{m-1}\right)\left[1 - k(\sigma - 1)\left(\frac{m+1}{m-1}\right)\right], \tag{8.1.15} \]

where ρ has been expressed in terms of m and k using the relations in Table 6.3.
Putting the zero coma condition into the first equation in Table 8.3 we find

\[ B_{1s} = \frac{\theta^2}{4R_1}\left\{\left(\frac{1+k}{k}\right)(2 - \sigma) - \frac{2}{k}\left(\frac{m-1}{m}\right)^2\left[1 - k(\sigma - 1)\left(\frac{m+1}{m-1}\right)\right]\right\}. \tag{8.1.16} \]

There are an infinite number of possible combinations, but we are only interested
in “short” versions and choose σ = 1. Equations (8.1.15) and (8.1.16) are now
much simpler, but analysis of this special case will illustrate the general
characteristics of this type of telescope.

Substituting σ = 1 and expressing k in terms of m and β from Table 6.3 gives

\[ B_{1s} = \frac{\theta^2}{4R_1}\left(\frac{m+1}{1+\beta}\right)\left[\frac{1+\beta}{m+1} + 1 - 2\left(\frac{m-1}{m}\right)^2\right], \tag{8.1.17} \]

where β is the normalized back focal distance. As an example let m = 5 and
β = 0.2. From Eq. (8.1.15) we get K_2 = -0.4531 (K_2 is independent of β) and
from Eq. (8.1.17) we get B_1s = -0.40(θ²/4R_1). Inserting B_1s into Eq. (5.6.6) and
dividing by s′ to get the angular aberration gives AAS = 0.39 arc-sec for
θ = 0.5° and F_1 = 2. Thus, for the chosen parameters, the astigmatism is
undetectable over a 1° field diameter with a ground-based telescope for practically
all atmospheric conditions. It is worth noting that a choice of m = 4.5, with
the other parameters unchanged, gives astigmatism that is about 10 times smaller
at the same field angle. An increase in the field diameter to 2° is possible in this
case.
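
The numbers in this example follow from closed-form expressions for σ = 1 and K_1 = 0. The sketch below assumes K_2 = (m + 1)/(m - 1) - [m/(m - 1)]³ from the zero-coma condition, the bracketed factor 1 + [(m + 1)/(1 + β)][1 - 2((m - 1)/m)²] multiplying θ²/4R_1 in B_1s, and AAS = 2|B_1s|y_1 = |factor| θ²/(8F_1). All function names are hypothetical.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def short_sc_K2(m):
    """Conic constant of the secondary for zero coma (sigma = 1, spherical primary)."""
    return (m + 1) / (m - 1) - (m / (m - 1)) ** 3

def short_sc_astig_factor(m, beta):
    """Dimensionless factor multiplying theta^2 / (4 R1) in B1s for sigma = 1."""
    return 1 + (m + 1) / (1 + beta) * (1 - 2 * ((m - 1) / m) ** 2)

def short_sc_aas_arcsec(m, beta, theta_deg, F1):
    """Angular astigmatism in arc-seconds at field angle theta_deg."""
    theta = math.radians(theta_deg)
    return abs(short_sc_astig_factor(m, beta)) * theta**2 / (8 * F1) * ARCSEC_PER_RAD

for m in (5.0, 4.5):
    print(f"m={m}: K2={short_sc_K2(m):.4f}  factor={short_sc_astig_factor(m, 0.2):+.3f}  "
          f"AAS={short_sc_aas_arcsec(m, 0.2, 0.5, 2.0):.3f} arcsec")
```

The astigmatism factor changes sign between m = 4.5 and m = 5, which is why a modest change in m reduces the residual astigmatism so strongly.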

For configurations with negligible astigmatism the field curvature is essentially
Petzval curvature only. For m = 4.5 and β = 0.2 we find k = 0.218 and
ρ = 0.281. Thus the Petzval curvature is 5.12/R_1 and the focal surface is
rather strongly curved. As Wilson notes, this curvature is of little consequence
for visual use but would require a field flattening lens for photography to get the
best image definition over a wide field.

It is left as an exercise for the reader to determine the chromatic properties of
the corrector plate for a short SC. The result for m = 4.5 is Ω_3 = 0.80 and
m³Ω_3 = 73. Although this latter value is large compared to those in Tables 8.6 or
8.7, it should be noted that for the short SC the comparison is with a standard
Schmidt whose focal ratio is of order 10, not 3 as in the tables cited. If, instead,
we compare an f/9 SC (f/2 primary mirror for m = 4.5) with an f/3 Schmidt of
the same diameter, then the ratio from Eq. (8.1.9) is 2.7. Thus the chromatic
aberration due to a single corrector plate is tolerable in a short SC with m ≈ 5.


8.1.e. CONCLUDING REMARKS 


The SC designs discussed in the preceding sections are either anastigmats or 
aplanats with the stop located at the corrector plate. Because both spherical 
aberration and coma are zero in all of these designs, it follows from the discussion 
in Section 5.5 that both coma and astigmatism are independent of the stop 
position. Thus all of the results given, except the chromatic aberrations, are valid 
for an arbitrary stop position. If the stop is displaced from the corrector, the 
chromatic effects increase by the factor Γ², as described in Section 7.2. Given the
already large chromatic effects in the SC compared to those of the standard 
Schmidt, it is evident that an SC with a stop displaced from the corrector is of 
limited usefulness. 

Of all of the designs considered in this section, only the Baker B version can 
be considered a contender with a standard Schmidt in the 1-m class, and then only 
if the corrector plate is achromatic. The Baker B design has a flat field, a factor in 
its favor, but larger vignetting because of its larger central obscuration, a factor 
against it. 

The short Schmidt-Cassegrain design, so popular with amateur astronomers, is 
really a competitor with two-mirror Cassegrain telescopes of the type discussed in 
Chapter 6 and not with Schmidt telescopes. With an accessible focal surface (a 
requirement for visual and photographic use) and excellent image quality (an 
aplanat with negligible astigmatism if properly designed), the compact Schmidt- 
Cassegrain is often the telescope of choice in apertures of 0.4m or smaller. 


8.2. CAMERAS WITH MENISCUS CORRECTORS 


We now turn our attention to another type of wide-field camera, one in which 
the aspheric corrector is replaced by a meniscus lens. The purpose of the 
meniscus is the same as that of the corrector, to compensate for the spherical 
aberration of the following mirror(s). The theory of the meniscus corrector was 
developed independently by Bouwers (1946), Maksutov (1944), and Baker 
(1940) with their names attached to various versions of meniscus cameras. In this
section we consider a subset of the many types of meniscus cameras that have 
been described in the literature. The reader should consult the references at the 
end of the chapter, including the monograph by Maxwell (1972), for details on 
these and other designs. Another excellent discussion of cameras of this type is 
given by Wilson (1996). 


8.2.a. CONCENTRIC MENISCUS CORRECTOR


A type of meniscus lens is one in which the two surfaces of the lens are 
concentric with the surface of a spherical mirror, as shown in Fig. 8.5. If an 
aperture stop is placed at the common center of curvature, as in a standard 
Schmidt, then the system has no unique axis and all off-axis aberrations are zero. 
The Petzval surface is also concentric with the other surfaces and the image 
surface is curved, as in a standard Schmidt. The characteristics of the images are 
determined entirely by the spherical aberration and any chromatic aberration 
introduced by the meniscus. 

The complete analysis of the spherical aberration of the system shown in Fig. 
8.5 involves the application of Eq. (5.6.7) with j = 3 together with the corre- 
sponding coefficients from Tables 5.5 and 5.6. The result of this exercise, with all 
surfaces concentric, is a cubic equation involving the thickness and location of 
the meniscus. Although the solution of this equation gives results in good
agreement with those derived from ray traces, the form of the equation is quite 
complicated and gives little insight into the workings of the meniscus lens. It is 
more instructive to follow the approach by Bouwers and we choose to use his 
method. 

The starting point in the Bouwers method is the assumption that the spherical
aberration coefficient of the meniscus is that of a thin lens for which the source is
at infinity. Although the derivation of this result is straightforward using the
results in Chapter 5, we take the expression given by Bouwers and convert it into
the desired coefficient. The result is

\[ B_{3l} = -\frac{1}{8f^3}\left[\left(\frac{n}{n-1}\right)^2 - \frac{f}{R_1}\left(\frac{2n+1}{n-1}\right) + \left(\frac{f}{R_1}\right)^2\frac{n+2}{n}\right], \tag{8.2.1} \]

where f is the focal length of the lens and R_1 is the radius of curvature of its first
surface. For a concentric lens, as we show in what follows, f >> R_1 and to a good
approximation

\[ B_{3l} = -\frac{n+2}{8nfR_1^2}. \tag{8.2.2} \]

Fig. 8.5. Bouwers concentric camera with meniscus corrector. All surfaces are spherical with
common center of curvature C.

The focal length of a thick lens is found by substituting P_1 and P_2 in Eq. (2.4.1)
into Eq. (2.4.3), with the result

\[ \frac{1}{f} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{d(n-1)^2}{nR_1R_2}. \tag{8.2.3} \]

The condition for a concentric lens is d = R_1 - R_2, where d > 0 and the radii are
negative according to the sign convention. Rewriting Eq. (8.2.3) in terms of d we
find that the focal length of a concentric lens is given by

\[ \frac{1}{f} = -\frac{d}{R_1R_2}\left(\frac{n-1}{n}\right). \tag{8.2.4} \]

Practical values of d are 10 or more times smaller than R_1, hence f is typically 30
times or more larger than R_1. Thus we are justified in taking Eq. (8.2.2) for the
spherical aberration coefficient of the lens.

Note that the concentric meniscus lens has a large negative focal length and is 
therefore a weak diverging lens. Thus the lens is thicker at the margin than at the 
center, the same as that of an aspheric corrector without an added radius term, and 
the signs of the spherical aberration coefficients of the lens and aspheric plate are 
the same. 

To find the system spherical aberration coefficient we simply add Eq. (8.2.2) to
that of a spherical mirror in collimated light from Table 5.2. The result, after
substitution of Eq. (8.2.4), is

\[ B_{3s} = \frac{(n-1)(n+2)}{8n^2}\,\frac{d}{R_1^3R_2} + \frac{1}{4R^3}, \tag{8.2.5} \]

where R is the radius of curvature of the mirror. Note that by adding the
coefficients to get Eq. (8.2.5), we have ignored the divergence of the beam from
the lens and taken the same ray heights at the mirror and lens. This is similar to
the procedure followed for the Schmidt telescope and is acceptable here in view
of the other approximations made.

At this point we express d, R_1, and R_2 in terms of R as follows:

\[ d = -\zeta R, \qquad R_1 = \chi R, \qquad R_2 = (\chi + \zeta)R, \]

where χ and ζ are positive. Setting Eq. (8.2.5) equal to zero, substituting in terms
of R, and solving for ζ, we find

\[ \zeta = \chi\left[\frac{(n-1)(n+2)}{2n^2\chi^3} - 1\right]^{-1}. \tag{8.2.6} \]

Taking n = 1.46, values of ζ for a selected set of χ values are found in Table 8.8.
Note that ζ, the normalized thickness of the meniscus, increases rapidly as the
lens is placed farther from the stop.
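
The nominal meniscus thickness is simple to tabulate. This sketch assumes the third-order balance condition in the form ζ = χ[(n - 1)(n + 2)/(2n²χ³) - 1]^(-1), with χ = R_1/R and ζ = -d/R; the function name is hypothetical.

```python
def meniscus_thickness(chi, n):
    """Normalized thickness zeta that cancels third-order spherical aberration."""
    q = (n - 1) * (n + 2) / (2 * n**2 * chi**3)
    return chi / (q - 1)

# Fused-silica-like index, as used for the nominal parameters in the text.
for chi in (0.200, 0.225, 0.250, 0.275, 0.300):
    print(f"{chi:.3f}  {meniscus_thickness(chi, 1.46):.5f}")
```

The steep growth of ζ with χ comes from the χ³ term in the denominator: moving the lens toward the mirror reduces its leverage on the spherical aberration, so a thicker meniscus is needed.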

The values of χ and ζ in Table 8.8 serve as the starting point for ray-trace
analysis of the meniscus camera. Results from ray traces of a series of f/3
systems with these nominal parameters are given in Table 8.8. In view of the
approximations made in this approach by Bouwers, it is not surprising that the
image quality is unacceptable for the calculated combinations of χ and ζ.
Analysis of these images shows the presence of both residual third-order
spherical aberration and a significant amount of fifth-order spherical aberration.

Acceptable image quality is achieved by holding χ constant and adjusting ζ to
give the monochromatic image diameter its smallest possible value. The results
found from this analysis for f/3 systems are shown in Table 8.9. Note that for
larger χ the values of ζ derived by this procedure are significantly larger than
those from Eq. (8.2.6). By changing ζ at a given χ, third-order spherical
aberration of an amount approximately equal in magnitude but opposite in sign
to that of the fifth-order contribution for rays at the margin can be introduced. The
result is an image diameter that is significantly smaller. Hence a significant
improvement is achieved by balancing the spherical aberration contributions.


Table 8.8

Nominal Parameters for Concentric Meniscus Lens^a

   χ        ζ       Blur^b

 0.200   0.00438     3.2
 0.225   0.00708     4.5
 0.250   0.01092     8.7
 0.275   0.01622    11.7
 0.300   0.02339    15.4

^a Values of χ and ζ derived from Eq. (8.2.5) with n = 1.46.

^b Image diameter at best focus, in units of arc-seconds, for f/3 systems.


Table 8.9

Parameters for Optimized Concentric Meniscus Cameras^a

   χ        ζ      f_c/R   BFD/R   Blur^b

 0.200   0.0043   0.492   0.508    2.8
 0.225   0.0076   0.489   0.511    2.0
 0.250   0.0128   0.485   0.515    1.6
 0.275   0.0209   0.481   0.519    1.2
 0.300   0.0338   0.475   0.525    1.0

^a Results derived from ray traces with ζ adjusted to minimize image blur diameter;
f_c = camera focal length; BFD = distance from mirror to focal surface.

^b Image diameter at best focus, in units of arc-seconds, for f/3 systems.

Note also that the camera focal lengths decrease and the back focal lengths 
increase as the lens thickness increases. This is a consequence of the changing 
focal length of the concentric lens, as is evident from examination of Eq. (8.2.4). 

Although the monochromatic image size is acceptable for a meniscus camera 
with a thick lens, such as for y% 0.3, the polychromatic image size is 
unacceptable. This is a consequence of the change in focal length of the lens 
with changing wavelength, or longitudinal chromatic aberration. From Eq. 
(8.2.4) we find that the focal length of the lens changes with index according 
to the relation 


df /f = —dn/n(n — 1). (8.2.7) 


Because the rays incident on the mirror appear to come from the focal point of the
lens, a shift of this point translates into a shift of the camera focal point. Denoting
the camera focal length by f_c and applying Eq. (2.5.5) we find df_c = -m² df,
where m is approximately -f_c/f, the magnification due to the mirror.
Combining these results with Eqs. (8.2.4) and (8.2.7) we find


$$ \frac{df_c}{f_c} = \frac{\varepsilon\, dn}{2 n^2 \chi^2}, \tag{8.2.8} $$

where df_c is the axial shift of focus with changing index. For the balanced system
with χ = 0.275, ε = 0.0209, dn = 0.0018, and n = 1.46, Eq. (8.2.8) gives
df_c/f_c = 0.000117. The diameter of the image over the range of wavelengths
spanned by this change of index (510 to 590 nm for an SiO2 lens) is nearly 9
arc-sec, a significant increase over the monochromatic diameter of 1.2 arc-sec.
Thus the concentric meniscus corrector lens is not a viable option because of
longitudinal chromatic aberration. 
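
As a numerical check on the chromatic shift quoted above, the following Python sketch evaluates Eqs. (8.2.7) and (8.2.8) for the balanced system. It assumes that Eq. (8.2.8) has the form df_c/f_c = ε dn/(2n²χ²), a form adopted here because it reproduces the quoted value of 0.000117; the function names are illustrative only.

```python
def lens_focal_shift_fraction(n, dn):
    """Fractional change df/f of the concentric lens, Eq. (8.2.7)."""
    return -dn / (n * (n - 1.0))


def camera_focus_shift_fraction(chi, eps, n, dn):
    """Fractional focus shift df_c/f_c of the camera.

    Assumed form of Eq. (8.2.8): eps * dn / (2 n^2 chi^2).
    """
    return eps * dn / (2.0 * n**2 * chi**2)


# Balanced system quoted in the text (Table 8.9 row with chi = 0.275)
lens_shift = lens_focal_shift_fraction(n=1.46, dn=0.0018)
camera_shift = camera_focus_shift_fraction(chi=0.275, eps=0.0209, n=1.46, dn=0.0018)

print(f"df/f     = {lens_shift:.5f}")
print(f"df_c/f_c = {camera_shift:.6f}")   # about 0.000117
```

The lens itself changes focal length by about a quarter of a percent over this index range; the camera focus moves by a much smaller fraction because the lens is weak compared with the mirror.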

There are two methods for reducing the chromatic aberration of the meniscus. 
One, proposed by Bouwers, is an achromatic meniscus made of two different 
glasses cemented together. In this case the cemented interface cannot be made 
concentric with the outer surfaces, and the system is no longer strictly concentric. 
If, however, the two glasses have the same index of refraction but different Abbe 
numbers, then the cemented lens is still very nearly concentric. This possibility is 
discussed by Wilson (1996). For further details the reader should also consult the 
references by Bouwers (1946) and Maxwell (1972). 

A second method, first proposed by Maksutov (1944), is an achromatic 
meniscus corrector made of a single glass with f invariant to a change in 
index. To achieve this condition, however, it is necessary to depart from the 
concentric lens surfaces. We examine briefly the characteristics of this type of 
corrector in the next section. 


8.2.b. MAKSUTOV ACHROMATIC CORRECTOR 


The achromatic corrector proposed by Maksutov is one in which the focal 
length is invariant to a change in index. This condition is easily derived by taking 
f for a thick lens and setting df /dn = 0. Applying this condition to Eq. (8.2.3) we 
find 





$$ d = (R_1 - R_2)\,\frac{n^2}{n^2 - 1}. \tag{8.2.9} $$

Relative to a concentric lens, we see from Eq. (8.2.9) that this lens is roughly 2
times thicker. Using Eq. (8.2.9) we can find the separation Δz between the centers
of curvature of the surfaces of the meniscus, with the result


$$ \Delta z = (R_1 - R_2) - d = -\frac{d}{n^2}, \tag{8.2.10} $$


where the minus sign indicates that the center of curvature of surface 2 is closer to 
the mirror than that of surface 1. It is evident from Eq. (8.2.10) that the surfaces 
of the meniscus are more nearly concentric for small d. 

We are not going to discuss all of the details in the design of a Maksutov 
camera, but instead will illustrate the general characteristics with one example. 
For a mirror of radius of curvature R we take d = -0.02473 R as a constant and
vary R_1, R_2, and the lens-mirror separation until spherical aberration of the
marginal rays is balanced. The value of R_2, of course, is tied to that of R_1 by
Eq. (8.2.9). The axial position of the stop is then altered until coma is balanced,
with the parameters of the final system shown in Table 8.10. Note that the stop is 
near the first surface of the meniscus lens. 
Table 8.10

Parameters of a Maksutov Achromatic Camera^a

                       Distance from Mirror
Stop                        0.5947|R|
Surface 1 of lens           0.5914|R|
Surface 2 of lens           0.5667|R|
Image                       0.5129|R|

^a Index n used in Eq. (8.2.8) is 1.46, d = 0.0248|R|, R_1 = 0.2087R, R_2 = 0.2219R.
Radius of curvature of image surface = 0.538R.


Ray traces for an f /3 system with parameters given in Table 8.10 show an on- 
axis image whose angular diameter is about 3 arc-sec, and a slow increase in 
image size to 5 arc-sec at a field radius of 2°. This on-axis image diameter is 
about 10 μm for a camera whose focal length is 750 mm. The chromatic effects of
the corrector are much smaller than those of the concentric meniscus, as expected,
with df_c some thirty times smaller than that given by Eq. (8.2.7). For an SiO2
corrector the on-axis image diameters on a fixed focal surface do not exceed 4
arc-sec over the wavelength range from 400 to 700 nm.
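
The parameters in Table 8.10 can be checked against the achromatic condition with a few lines of Python. The sketch below assumes that Eq. (8.2.9) reads d = (R_1 - R_2) n²/(n² - 1) and that R is negative for the mirror, so that R_1 and R_2 are negative; all lengths are in units of |R|. The variable names are illustrative.

```python
import math

n = 1.46
d = 0.0248                     # lens thickness in units of |R| (Table 8.10)
R1, R2 = -0.2087, -0.2219      # surface radii in units of |R|, assuming R < 0

# Achromatic condition rearranged: R1 - R2 = d (n^2 - 1) / n^2
radius_gap_required = d * (n**2 - 1.0) / n**2
radius_gap_table = R1 - R2

# Eq. (8.2.10): separation of the centers of curvature
delta_z = (R1 - R2) - d
delta_z_expected = -d / n**2

# Linear size of a 3 arc-sec image for a 750-mm focal length
ARCSEC = math.pi / (180.0 * 3600.0)          # radians per arc-second
blur_um = 3.0 * ARCSEC * 750.0 * 1000.0      # mm converted to micrometers

print(f"R1 - R2: required {radius_gap_required:.4f}, table {radius_gap_table:.4f}")
print(f"delta z: {delta_z:.4f}, -d/n^2: {delta_z_expected:.4f}")
print(f"3 arc-sec at f = 750 mm: {blur_um:.1f} micrometers")
```

The tabulated radii satisfy the condition to within the rounding of the table, and the linear image size agrees with the value of about 10 μm quoted in the text.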

Although this system was not given a detailed optimization, it is evident that 
its image quality is quite satisfactory, provided the camera is not too large. 
Compared to the concentric meniscus camera, the Maksutov achromatic camera 
is clearly superior in its chromatic characteristics. 


8.2.c. CONCLUDING REMARKS 


The discussion of meniscus lens cameras in the preceding sections is only an 
introduction to cameras based on this type of corrector. Among other types are 
those in which the meniscus is split, with part of it preceding the aperture stop. In 
addition, the meniscus on either side of the stop may itself be split into two 
separate pieces of glass. There are also so-called hybrid systems in which an 
aspheric plate located at the stop is used in conjunction with a meniscus corrector 
and others in which an aspheric surface is put on one of the surfaces of the 
meniscus.

A widely used hybrid system is the Super-Schmidt or Baker-Nunn camera 
used for wide-field photography to record trails of meteors and artificial satellites. 
This type of camera has a double concentric meniscus, half on either side of the 
stop, with a doublet Schmidt plate at the stop. The light reflected from the mirror 
passes through the meniscus nearer the mirror a second time before coming to the 
Fig. 8.6. Maksutov Cassegrain telescope with achromatic meniscus corrector. The secondary
mirror shown is an aluminized circular area on the back face of the corrector. 


curved focal surface. Because the design is based on the concentric principle, an 
angular field diameter of about 50° with good imagery is achieved. 

A meniscus lens can also be used in place of an aspheric corrector in a 
Cassegrain camera, as shown schematically in Fig. 8.6. The secondary mirror in 
this type of system can be a separate mirror or a centered reflecting area on the 
back surface of the meniscus lens. 

For more details on these and other meniscus lens systems, the reader should 
consult the references at the end of the chapter, with those by Bouwers (1946) and 
Maxwell (1972) a good introduction. 


8.3. ALL-REFLECTING WIDE-FIELD SYSTEMS 


The discussion in this chapter and the preceding one is intended to show the 
basic characteristics of so-called wide-field cameras and telescopes. It is evident 
from this discussion that there are many designs capable of good imagery over 
fields of several degrees, with chromatic effects generally setting the limit to the 
range of wavelengths that can be covered. 

The principles used for these catadioptric designs also apply to all-reflecting 
wide-field systems. One obvious use of such a system is a space-based ultraviolet 
telescope. In these systems a reflecting aspheric corrector replaces the refracting 
plate and chromatic effects are absent. To separate the incident beam from the 
beam reflected from the corrector, the corrector axis is tilted by angle θ relative to
the mirror axis. The angle between the incident and reflected chief ray at the
corrector is then 2θ.

The main complication of a tilted corrector is that its surface figure must be 
modified so that a collimated beam from the center of the field “sees” a circular 
profile on the corrector. This is achieved by placing an elliptical figure on the 
corrector with r² in Eq. (7.2.1) given by r² = x² cos² θ + y², instead of
r² = x² + y². The other change required is to replace z in Eq. (7.2.1) by
z cos θ; this ensures that the OPD introduced by the tilted corrector is the same
as that of an untilted plate. An aspheric plate with an elliptical figure is obviously 
more difficult to make than one with a circular figure, and no large systems of this
type have been made. We do not discuss this type of system in detail here; the 
interested reader should consult the paper by Schroeder (1978). 
Chapter 9 Auxiliary Optics for Telescopes


Telescopes are often used for direct imaging without extra optics in the light 
beam, but there are many observations for which auxiliary optics are required. 
Examples of some types of observations that require additional optics are 
spectroscopy and photometry. In the case of photometry this is often no more 
than a field lens to reimage the exit pupil of the telescope onto a detector. For 
spectroscopy the extra optics may be as simple as a prism or diffraction grating 
placed in the light beam, or a separate spectrograph with many optical elements 
whose entrance aperture is at one of the telescope foci. The characteristics of 
spectrometers are discussed in Chapters 14 and 15. 

Even for direct imaging, it may be important to enhance the characteristics of 
an existing telescope by adding optical elements to widen the field, flatten the 
image surface, compensate for atmospheric refraction, or reimage at a different 
focal ratio. For a new telescope it is now customary to include such optics in the 
early design stages and often to design the telescope with its auxiliary optics as a 
system in itself. 

The kinds of auxiliary optics discussed in this chapter include field lenses and 
field flatteners, corrector systems for both prime and Cassegrain focus, focal 
reducers for Cassegrain telescopes, and atmospheric dispersion correctors. There 
is also a brief discussion of elements used in fiber optics. Attention is given to the 
aberration characteristics of these systems, with examples given for each type of 
system considered. 
9.1. FIELD LENSES, FLATTENERS


A field lens is an element that is placed at or near an image plane in an optical 
system. One application for such a lens, as noted in Section 5.7, is that of 
flattening a curved image surface. Before discussing this application, and others, 
we consider the aberrations introduced when an object or image lies close to an 
optical surface. 


9.1.a. ABERRATIONS 


The aberration coefficients for a general surface are given in Table 5.5. For an
object close to the surface, take Γ from Table 5.5, replace s' using Eq. (2.2.4), and
let s ≪ R. The result obtained is


[Eq. (9.1.1), the limiting form of Γ for s ≪ R, is illegible in the source.]


Given the condition that s is small, the dominant terms in the coefficients in Table
5.5 are those containing the factor Γ. Taking only the dominant terms, putting the
coefficients into Eq. (5.5.9), and letting s'/n' = s/n, gives


[Eq. (9.1.2), the resulting transverse aberrations, is illegible in the source.]


Because y/s is finite for all s, each of the transverse aberrations in Eq. (9.1.2) 
goes to zero as s approaches zero and the image is free of aberrations in this limit. 
This result is not surprising because s' → 0 as s → 0, and the image and object
coincide when s = 0. 

For a real lens placed at an image surface, s cannot be zero for both surfaces, 
but it is small enough for each surface so that its contributions are usually of little 
consequence. 


9.1.b. FIELD-FLATTENED RITCHEY-CHRETIEN TELESCOPE 


As an example, we consider a lens placed near the Cassegrain focus of a 
Ritchey-Chretien telescope, as shown in Fig. 9.1. The lens parameters are chosen 
so that the median astigmatic surface of the telescope-lens combination is flat, 
thus 


$$ \kappa_m(\mathrm{RC}) + \kappa_m(\mathrm{lens}) = 0. \tag{9.1.3} $$


Fig. 9.1. Ritchey-Chretien telescope with field flattener lens FF. The lens parameters are given in 
Eq. (9.1.4). 


The relation for the first term in Eq. (9.1.3) is given in Table 6.9. We find the 
second term by noting that a thin lens with s ≈ 0 has negligible aberrations
according to Eq. (9.1.2), hence κ_m = κ_p.

The Petzval curvature of a lens is derived from the relation in Table 5.7, with 
the choice R_2 = ∞ so that the distance between the flat image and lens can be
made as small as desired. Substituting the derived result into Eq. (9.1.3) gives 


$$ \frac{n - 1}{n R_1} = \kappa_m(\mathrm{RC}), \tag{9.1.4} $$


where R_1 is the radius of curvature of the lens surface facing the secondary.
Because κ_m for a Ritchey-Chretien is negative (the image surface is concave as
seen from the secondary), the lens has R_1 < 0 and is plano-concave in cross
section, as shown in Fig. 9.1.

Using the parameters of the RC telescope in Table 6.10 and letting
R_1 = -6000 for the primary gives R_1 = -248 for the lens with n = 1.46, the index
of SiO2 at a wavelength of 548 nm. The aberrations of the telescope with and
without the field flattener
lens are given in Table 9.1, with the results taken from a computer ray-trace 
program. 

Note that the field lens does change the system aberrations, but only slightly. 
Because the lens has nonzero astigmatism, the assumption that κ_m = κ_p for the lens
is not quite true, and the image surface curvature is not zero. However, this
assumption is a good first approximation, and a change of R_1 to -260 gives a flat
image surface. 
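
The first-order flattener radius follows directly from Eq. (9.1.4). The short Python sketch below, with an illustrative function name, solves that relation for R_1 using the median curvature of the telescope from Table 9.1 (κ_m R_1 = 7.625 with R_1 = -6000 for the primary). As a side check, the same relation with the curvature 2/R of a spherical mirror of radius R = -1000 numerically reproduces the lens radius of -157.5 quoted for the Schmidt camera in Section 9.1.c.

```python
def flattener_radius(n, kappa):
    """Solve (n - 1) / (n * R1) = kappa for the lens radius R1."""
    return (n - 1.0) / (n * kappa)


n = 1.46

# Ritchey-Chretien: kappa_m * R1(primary) = 7.625 with R1(primary) = -6000
kappa_rc = 7.625 / -6000.0
r_lens_rc = flattener_radius(n, kappa_rc)

# Spherical mirror of radius R = -1000 (Schmidt camera): curvature 2/R
kappa_schmidt = 2.0 / -1000.0
r_lens_schmidt = flattener_radius(n, kappa_schmidt)

print(f"RC flattener radius:      {r_lens_rc:.1f}")
print(f"Schmidt flattener radius: {r_lens_schmidt:.1f}")
```

The ray-trace value of -260 for a truly flat median surface differs from this first-order estimate by about 5%, which is the size of the correction discussed in the text.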


9.1.c. FIELD-FLATTENED SCHMIDT CAMERA 


As a second example we consider a field-flattener lens placed near the curved 
focal surface of a Schmidt camera. In Section 5.7 we derived the condition for a 
flat Petzval surface for the combination of a spherical mirror and thin lens, with 
Eq. (5.7.16) giving the required condition for a plano-convex lens. The spherical 
Table 9.1

Aberrations for Field-Flattened Ritchey-Chretien Telescope^a,b

                                  With lens
            Without lens    R_1 = -248    R_1 = -260
ASA            0.000          0.002         0.002
ATC            0.000          0.039         0.037
AAS            1.025          0.913         0.924
κ_m R_1        7.625         -0.393         0.003
f              12000          12101         12096

^a Lens: plano-concave shape; thickness at vertex = 6; back focal
distance = 0.43. Telescope: R_1 = -6000. Other parameters in Table 6.10.

^b Aberrations are given at a field angle of 18 arc-min in units of
arc-seconds.


aberration of the mirror alone is canceled by an aspheric plate with b chosen 
according to Eq. (7.1.5) with m = 0. 

Choosing R = -1000 and R_2 = ∞ gives b = -2E-9 and R_1 = -157.5 for
n = 1.46. The aberrations of the camera with and without the flattener lens are 
shown in Table 9.2, with F = 2.5 for the camera without the lens. Note that the 
lens flattens the field, but introduces significant aberrations, especially coma and 
spherical aberration. 


Table 9.2

Aberrations for Field-Flattened Schmidt Camera^a,b

                                 With lens
           Without lens    W/R = 1.0    W/R = 0.979
ASA           0.002          2.807         2.807
ATC           0.000          2.247         0.010
AAS           0.000          0.208         0.217
κ_m R         2.000         -0.0004       -0.0004
f             500.0          494.2         494.2

^a Lens: plano-concave shape; R_1 = -157.5; thickness at vertex = 5; back focal
distance = -0.53. Mirror: R = -1000. Corrector: b = -2E-9, R_c = ∞, thickness at
vertex = 10.

^b Aberrations are given at a field angle of 1° in units of arc-seconds.
The coma due to the field flattener lens can be largely removed by reducing the
corrector-mirror separation by about 2%, but this displacement does not, of 
course, affect the spherical aberration. For this example it turns out that higher- 
order aberrations are not negligible. Fifth-order spherical aberration compensates 
in part for third order and the result from ray tracing is an on-axis blur diameter of 
about 1.4 arc-sec. This blur can be reduced to a negligible value by adding an 
aspheric term of higher order to the corrector, as noted in Section 7.2. 

Comparing the effects of the flattener lens in these two examples, it is evident 
that the aberrations it introduces are significantly larger for the Schmidt camera. 
For spherical aberration this is entirely a consequence of the different focal ratios, 
F = 2.5 for the camera and F = 10 for the RC telescope, where y/s = 1/2F in 
Eq. (9.1.2). For astigmatism and coma the different pupil position for the lens is 
also a factor. For the RC telescope the pupil location is given by Eq. (2.6.1), from 
which we find W/R_1 ≈ 15. For the Schmidt camera the mirror images the
aperture stop back on itself and W/R_1 ≈ 3. Substitution of these results into the
relations in Eq. (9.1.2) accounts for the relative sizes of the aberrations introduced 
by the field lens. 

In two other applications of a field lens its primary purpose is to reimage the 
exit pupil of the telescope. For a photometer an aperture at the telescope focus 
passes the light of a single star and a field lens at the aperture images the 
telescope exit pupil on the photosensitive surface of a detector. If the star should 
wander in the aperture because of atmospheric effects, the effect is not seen by the 
detector because the reimaged exit pupil does not wander on its surface. Such a 
lens is often called a Fabry lens. When the instrument on a telescope is a 
spectrograph a lens is often placed at the entrance aperture so that the lens, in 
combination with the spectrometer collimator, reimages the exit pupil onto the 
grating or prism that follows the collimator in the spectrometer. 


9.2. PRIME FOCUS CORRECTORS


A large Ritchey-Chretien telescope is generally equipped with interchangeable 
secondaries to provide a range of focal ratios, as noted in Section 6.2. The focal 
ratio at the Cassegrain focus is usually the smallest, typically 6 to 8, which for a 
4-m telescope gives an image scale of about 7 arc-sec/mm. With this scale the 
typical blur diameter of a star image is often not well matched to the size of a 
detector resolution element, usually an individual pixel in a solid state detector. A 
better match between image and pixel size is achieved if the focal ratio is smaller, 
which in a Cassegrain configuration means a larger secondary and more 
obscuration. An alternative approach to getting a smaller focal ratio is to use 
the primary mirror only, which, in combination with a corrector system, can
provide a usable field at a focal ratio of 2 or 3. 
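
The image scales behind this argument are easy to tabulate. The following Python sketch uses the standard relation scale = 206265/f arc-sec per unit length, with f = F D; the function name is illustrative and the focal ratios are those mentioned in the text.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi   # about 206265


def image_scale_arcsec_per_mm(diameter_m, focal_ratio):
    """Image scale in arc-sec/mm for aperture D (meters) and focal ratio F."""
    focal_length_mm = diameter_m * 1000.0 * focal_ratio
    return ARCSEC_PER_RADIAN / focal_length_mm


for focal_ratio in (8.0, 6.0, 3.0, 2.0):
    scale = image_scale_arcsec_per_mm(4.0, focal_ratio)
    print(f"4-m telescope at f/{focal_ratio:g}: {scale:5.1f} arc-sec/mm")
```

For a 4-m aperture the scale is between roughly 6 and 9 arc-sec/mm at the Cassegrain focal ratios of 6 to 8, and two to four times coarser at the prime-focus ratios of 2 or 3.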


9.2.a. ASPHERIC PLATES


The simplest prime focus corrector system is a single aspheric plate in the 
converging beam near the image surface, as shown in Fig. 9.2. The aperture stop 
is the primary mirror and the plate is distance g from the focus, hence W = f — g 
for the plate. The aberration coefficients for the primary are taken from Table 5.2, 
with m = 0, and the coefficients for the corrector are taken from Table 5.5, using 
only the terms in b. Substituting these results into Eq. (5.6.7) and choosing 


Yı = Yı, we get 





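$$
\begin{aligned}
B_{1s} &= -\frac{\theta^2}{2}\left[\frac{1}{f} + \frac{b(f-g)^2 g^2}{f^2}\right],\\
B_{2s} &= \frac{\theta}{4}\left[\frac{1}{f^2} - \frac{2b(f-g)g^3}{f^3}\right],\\
B_{3s} &= -\frac{1}{8}\left[\frac{K+1}{4f^3} + \frac{b g^4}{f^4}\right],
\end{aligned}
\tag{9.2.1}
$$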


where ψ = -θ and y_2/y_1 = g/f have been substituted. For a given K there are
two free parameters, b and g, in Eqs. (9.2.1) and two of the coefficients can be set 
to zero. The dominant aberrations at small field angle, for any primary other than 
a paraboloid, are spherical aberration and coma. Setting these coefficients to zero 
gives 





$$ b = -\frac{(K+1)f}{4g^4}, \qquad \frac{g}{f} = \frac{K+1}{K-1}. \tag{9.2.2} $$




Fig. 9.2. Aspheric plate prime-focus corrector for hyperboloidal primary at distance g from the
focal surface. 
It is evident from Eqs. (9.2.2) that the location of the plate is set by the conic
constant of the primary and this, in turn, sets the value of b. Because g and f are 
each positive, so is b for K < —1. Note also that the condition g > 0 means that 
correction of both spherical aberration and coma with a single plate is not 
possible for an ellipsoidal primary. 

Taking b from Eq. (9.2.2) and substituting into B_1s, we find


$$ B_{1s} = -\frac{\theta^2}{2f}\left(\frac{K}{K+1}\right). \tag{9.2.3} $$

Comparing the astigmatism given by Eq. (9.2.3) with that of the primary only, we 
see that, depending on the value of K, the radius of the usable field is limited to a 
few arc-minutes. From Eq. (9.2.3) we also see that the larger is K for the 
hyperboloid in absolute terms, the smaller is the astigmatism at a given field angle 
and the larger is the usable field. A larger difference also means a greater distance 
between the plate and the focus, and a larger plate size, as seen from Eq. (9.2.2). 
As pointed out by Gascoigne (1973), these conclusions also hold for more 
complex corrector systems. 

The final parameter of interest for this system is the curvature of the median 
image surface. Following the procedure in Section 6.2 we find 


$$ k^2 B_{1s}(\mathrm{cor}) = B_{1s}(\mathrm{pri}), \qquad \theta = k\theta', $$


where k = g/f. Using these relations we get


$$ \kappa_m = \left(\frac{K-1}{K+1}\right)\frac{1}{f}, \tag{9.2.4} $$


hence the focal surface of best images is strongly curved, and is concave as seen 
from the primary. 

An example of these results applied to an f/3 Ritchey-Chretien primary is 
shown in Table 9.3. The conic constant chosen is that for an RC telescope with 
m = 2.5 and β = 0.25 at the Cassegrain focus. The plate has a diameter-to-thickness
ratio of 25 and its radius of curvature is chosen to give minimum
chromatic effect. The radius of the plate r_p is chosen to cover a field radius of
about 0.12°. 

From the results in Table 9.3 we see that the coma and spherical aberration of 
the primary have been largely but not entirely eliminated. The size of the 
residuals depends on plate thickness and orientation. Although ASA for the 
example in Table 9.3 can be reduced to zero either by moving the plate closer to 
the primary or adjusting the value of b, ray-trace results show that spherical 
aberration and coma of higher order are not negligible. These aberrations are 
reduced to negligible levels by including a fifth-order aspheric coefficient b’ and 
adjusting the aspheric parameters and plate position, details that are omitted here. 
Table 9.3

Characteristics of a Prime Focus Corrector^a,b

                                With plate
           Without plate    theory     actual^c
ASA           21.221         0.000      0.850
ATC            7.500         0.000      0.057
AAS            0.105         0.696      0.707
κ_m f          1.000        12.25      12.20

^a Primary: diameter = 4.0 m; f = 12.0 m; K = -1.17778.
Plate: b = 5.792E-10; R = -52800 mm; r_p = 200 mm; n = 1.46;
thickness = 16 mm; g = 0.08163f; W = 11,000 mm.

^b Aberrations are given at a field angle of 0.1° in units of arc-seconds.

^c Values in "actual" column are from ray-trace program.
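
The "theory" entries and plate parameters in Table 9.3 can be reproduced with a short calculation. The Python sketch below is a check under stated assumptions, not the author's procedure: it takes Eq. (9.2.2) as b = -(K+1)f/(4g⁴) and g/f = (K+1)/(K-1), assumes the RC primary conic constant is K = -1 - 2(1+β)/[m²(m-β)], uses the standard third-order angular aberrations of a single mirror (ASA = |K+1|/64F³, ATC = 3θ/16F², AAS = θ²/2F), and scales the astigmatism by K/(K+1) and the median curvature by (K-1)/(K+1) when the plate is added. All names are illustrative.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)   # radians per arc-second


def rc_primary_conic(m, beta):
    """Assumed conic constant of an RC primary with parameters m and beta."""
    return -1.0 - 2.0 * (1.0 + beta) / (m**2 * (m - beta))


K = rc_primary_conic(m=2.5, beta=0.25)
f = 12000.0                  # primary focal length, mm
F = 3.0                      # primary focal ratio
theta = math.radians(0.1)    # field angle used in Table 9.3

# Plate position and aspheric coefficient, assumed form of Eq. (9.2.2)
g_over_f = (K + 1.0) / (K - 1.0)
g = g_over_f * f
b = -(K + 1.0) * f / (4.0 * g**4)     # mm^-3
W = f - g                             # plate distance from the primary, mm

# Angular aberrations of the primary alone, in arc-seconds
asa_primary = abs(K + 1.0) / (64.0 * F**3) / ARCSEC
atc_primary = 3.0 * theta / (16.0 * F**2) / ARCSEC
aas_primary = theta**2 / (2.0 * F) / ARCSEC

# With the plate: astigmatism and median curvature
aas_with_plate = aas_primary * K / (K + 1.0)
kappa_m_f = (K - 1.0) / (K + 1.0)

print(f"K = {K:.5f}, g/f = {g_over_f:.5f}, b = {b:.3e} mm^-3, W = {W:.0f} mm")
print(f"primary: ASA = {asa_primary:.3f}, ATC = {atc_primary:.3f}, AAS = {aas_primary:.3f}")
print(f"with plate: AAS = {aas_with_plate:.3f}, kappa_m f = {kappa_m_f:.2f}")
```

The agreement with the tabulated values supports the forms assumed for Eqs. (9.2.2) to (9.2.4); the small difference in AAS with the plate reflects the rounding of 0.105 in the table.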


This example illustrates a general procedure in the design of any system in 
which one or more of the elements is an aspheric plate. The procedure is one of 
taking only the aspheric terms in the aberration coefficients to get a first-order 
design and using computer analysis to refine the design. In this way one reduces 
the effort required in the theoretical analysis leading to the original design and 
uses the computer to help arrive at the final design. 

Although the single-plate corrector makes the prime focus of a Ritchey- 
Chretien primary usable, the surface of best images is sufficiently curved so that a 
field flattener lens is also needed. The sag of the image surface at a field angle of 
6 arc-min is about 220 μm. This sag, in combination with blur already present in
the off-axis image, gives an unacceptably large blur on a flat detector in focus on
the on-axis image. Wilson (1971) gives spot patterns for an aspheric plate-field
flattener combination for the ESO 3.6-m telescope, with acceptable image quality 
over a field diameter of about 0.25°. 
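
The sag quoted above follows from the median curvature in Table 9.3. The Python sketch below uses the shallow-surface approximation sag ≈ κ h²/2, with h = fθ the image height; this approximation and the variable names are assumptions of the sketch.

```python
import math

f_mm = 12000.0                      # primary focal length (Table 9.3)
kappa_m = 12.25 / f_mm              # median curvature from kappa_m f = 12.25
theta = math.radians(6.0 / 60.0)    # field angle of 6 arc-min

image_height_mm = f_mm * theta
sag_um = 0.5 * kappa_m * image_height_mm**2 * 1000.0

print(f"image height = {image_height_mm:.2f} mm, sag = {sag_um:.0f} micrometers")
```

The result is close to the 220 μm stated in the text.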

Before going on to other prime-focus correctors, it is worth noting that a single 
plate will not improve the images of a paraboloidal primary. Putting a corrector in 
the beam will, for example, introduce spherical aberration and astigmatism of 
unacceptable amounts if b is chosen to eliminate coma. The verification of this 
statement using Eqs. (9.2.1) is left as an exercise for the reader. 

Getting a larger and flatter image field at prime focus requires more complex 
correctors, of which many are discussed in the literature. Here we consider briefly 
a few of these, but without the detail given to the single-plate corrector. One kind 
of system that has been explored in detail is a set of corrector plates in series in 
the converging beam, as shown in Fig. 9.3 for three plates. Taking the aberration 
Fig. 9.3. Schematic of three-plate prime focus corrector.


coefficients for the mirror from Table 5.2 and those for the correctors from Table 
5.5, it is a straightforward step to the system coefficients. The results are 





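$$
\begin{aligned}
B_{1s} &= -\frac{\theta^2}{2}\left(\frac{1}{f} + \sum_i b_i W_i^2 k_i^2\right),\\
B_{2s} &= \frac{\theta}{4}\left(\frac{1}{f^2} - 2\sum_i b_i W_i k_i^3\right),\\
B_{3s} &= -\frac{1}{8}\left(\frac{K+1}{4f^3} + \sum_i b_i k_i^4\right),
\end{aligned}
\tag{9.2.5}
$$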


where the ith corrector is a distance W_i from the primary, g_i from the image
surface, and k_i = g_i/f.

For two plates there are four free parameters, b_1, g_1, b_2, g_2, and each of the
coefficients in Eq. (9.2.5) can be made zero for a hyperboloidal primary. In this 
case the signs of the aspheric coefficients are opposite, with b > 0 for the plate 
nearer the primary. For the same primary mirror parameters as in Table 9.3, ray- 
trace analysis of a two-plate corrector shows that the field of acceptable images is 
about two times larger in diameter than that of a single-plate system. The 
curvature of the median image surface is about 10 times smaller for the two- 
plate system, though its curvature is significant over the larger field. As noted by 
Gascoigne (1973), a paraboloid with a two-plate corrector has image blurs and 
surface characteristics comparable to those of a hyperboloid with a single-plate 
corrector. 

The design for a three-plate corrector, first proposed by Meinel (1953), has 
been described in the literature and the reader should consult the references at the 
end of the chapter for details. The field is larger than that achieved with the two- 
plate corrector by about a factor of two but, as noted by Wilson, the complete 
corrector set is not easy to manufacture because of the several large aspheric 
surfaces required. Wilson points out that the optical performance of the three- 
plate corrector system is no better than that of three-lens systems with spherical
surfaces, a type we consider briefly in the next section. 


9.2.b. WYNNE TRIPLETS 


An alternative approach to prime focus correctors is to use lenses with 
spherical surfaces, such as the Wynne corrector for a hyperboloidal primary 
shown in Fig. 9.4. Designs of this type give fields of good images up to 50 arc- 
min in diameter for an f/2.7 mirror and a somewhat larger field for a slower 
primary. The major advantages of this type, compared to the multiplate type, are 
ease of fabrication, flatter fields, and more compact size. For the corrector shown 
in Fig. 9.4 the length L is approximately 0.06f, less than that of aspheric plate 
systems, and hence the diameters of the separate lenses are less than those of the 
aspheric plates. 

Wynne has also shown that a three-lens corrector with a paraboloidal primary 
gives fields of comparable size and image quality. A schematic of this type of 
corrector, given by Wynne for an f/3.25 primary, is shown in Fig. 9.5. The 
correctors shown in Figs. 9.4 and 9.5 are drawn to the same scale for ease of 
comparison. Although the general forms of the corresponding lenses in the two 
correctors are similar, there are obvious differences in shape and spacing. The 
interested reader should consult the papers by Wynne (1972, 1974) for further 
details on these types of correctors. 

An excellent summary of the characteristics of prime focus lens correctors, 
including spot patterns, is given by Wilson (1996). He also compares the 
chromatic properties of three-plate aspheric systems with Wynne three-lens 
systems and notes that the latter have somewhat better image quality over an 
extended wavelength range. 




Fig. 9.4. Wynne triplet corrector for prime focus of hyperboloidal primary. See the article by 
Wynne (1972) for characteristics of the lens elements for an f /3.25 primary. 
Fig. 9.5. Wynne triplet corrector for prime focus of paraboloidal primary. See the article by
Wynne (1974) for characteristics of the lens elements for an f /3.25 primary. 


9.3. CASSEGRAIN FOCUS CORRECTORS 


Of the common two-mirror telescopes discussed in Chapter 6, the Ritchey- 
Chretien type has the largest field at the Cassegrain focus. To third order, the only 
significant aberrations are astigmatism and field curvature. It was first shown by 
Gascoigne that the placement of an aspheric plate in the Cassegrain beam 
removes the astigmatism without introducing a significant amount of coma and 
spherical aberration. This plate also reduces the field curvature because, as noted 
in Section 6.2, the median image surface is more strongly curved than the Petzval 
surface. In this section we discuss the characteristics of this type of corrector for 
the Cassegrain focus. 


9.3.a. ASPHERIC PLATE 

A diagram of an RC telescope with aspheric plate in the Cassegrain beam is
shown in Fig. 9.6, with the plate located a distance g from the focus and W from
the telescope exit pupil. The aberration coefficients of the system, referenced to
the primary, are


[Eq. (9.3.1), giving B_1s, B_2s, and B_3s in terms of b, g/f, and Wψ, is illegible in the source.]


where the astigmatism coefficient for the telescope is taken from Table 6.6, with 
Eq. (6.2.3) substituted for (K, + 1). 
Fig. 9.6. Aspheric corrector for Cassegrain focus of Ritchey-Chretien telescope. The aspheric
figure is similar to that of a Schmidt plate. 


To evaluate the coefficients we first express W and ψ in terms of the telescope
parameters and plate location. The location of the telescope pupil is given by Eq.
(2.6.1), and W = g - f_1 δ, where W is negative for the plate. The relation between
ψ and θ, given by Eq. (2.6.4), is ψ = θ(m/δ). In terms of the telescope parameters
we get


$$ W\psi = -\theta f\left[1 - \frac{g}{f}\,\frac{m^2+\beta}{m(1+\beta)}\right]. \tag{9.3.2} $$


A good first approximation to zero astigmatism is obtained by assuming that
g/f ≪ 1, substituting Wψ = -fθ into Eq. (9.3.1), and setting B_1s = 0. The
relation found is


$$ -b f g^2 = \frac{m(2m+1)+\beta}{2m(1+\beta)} = A. \tag{9.3.3} $$


This is one relation between b and g, with the other relation found by requiring
that the coma at a given field angle does not exceed a specified amount.
Substituting Eq. (9.3.3) into B_2s and B_3s we find the following angular aberrations:


$$ \mathrm{ATC} = \frac{3\theta A}{8F^2}\left(\frac{g}{f}\right), \qquad \mathrm{ASA} = \frac{A}{16F^3}\left(\frac{g}{f}\right)^2. \tag{9.3.4} $$


As an example we take the parameters for the RC telescope in Table 6.10 and,
from Eq. (9.3.3), find A = 3.625. Assuming that ATC = 0.25 arc-sec at θ = 0.3°,
we find from Eqs. (9.3.4) that g/f = 0.01703 and ASA = 0.014 arc-sec. Putting
this value of g/f into Wψ in Eq. (9.3.2), and substituting Wψ into Eq. (9.3.1), we
get better values for the off-axis aberrations. The results in arc-seconds are
ATC = 0.236 and AAS = 0.056, and all of the aberrations are clearly at a tolerable
level.
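
The numbers in this example can be traced with a short Python sketch. It is a check under stated assumptions: Eq. (9.3.4) is taken as ATC = (3θA/8F²)(g/f) and ASA = (A/16F³)(g/f)², Eq. (9.3.2) as Wψ = -θf[1 - (g/f)(m² + β)/(m(1 + β))], and the telescope parameters m = 4, β = 0.25, and F = 10 are values assumed here because they are consistent with A = 3.625 and with the focal lengths quoted earlier in the chapter. The variable names are illustrative.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)   # radians per arc-second

m, beta, F = 4.0, 0.25, 10.0          # assumed RC telescope parameters
A = (m * (2.0 * m + 1.0) + beta) / (2.0 * m * (1.0 + beta))

theta = math.radians(0.3)             # field angle
atc_target = 0.25                     # allowed coma, arc-sec

# Invert the assumed ATC relation for the plate position g/f
g_over_f = atc_target * ARCSEC * 8.0 * F**2 / (3.0 * theta * A)

# Spherical aberration introduced by the plate, arc-sec
asa = A * g_over_f**2 / (16.0 * F**3) / ARCSEC

# Coma refined with the finite-g correction to W*psi (assumed Eq. 9.3.2)
w_psi_factor = 1.0 - g_over_f * (m**2 + beta) / (m * (1.0 + beta))
atc_refined = atc_target * w_psi_factor

print(f"A = {A:.3f}, g/f = {g_over_f:.5f}")
print(f"ASA = {asa:.3f} arc-sec, refined ATC = {atc_refined:.3f} arc-sec")
```

The sketch reproduces A, g/f, ASA, and the refined coma; the refined astigmatism is not recomputed because it depends on the full form of Eq. (9.3.1).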
The one remaining calculation is that of finding b using Eq. (9.3.3). Because
b < 0, the plate has the shape of a Schmidt plate. With a radius added to the plate 
to minimize chromatic aberration, from Eq. (7.2.4), a ray-trace analysis of the 
plate in this example shows that the image diameters are 1 arc-sec or less over a
field diameter of about 1.2°. Through-focus spot diagrams at λ = 550 nm and
spot patterns at best focus for several wavelengths are shown in Figs. 9.7 and 9.8, 
respectively. It is instructive to compare the spot patterns for an uncorrected RC 
telescope in Fig. 6.3 with those in Fig. 9.7. Note especially the significantly larger 
field for the corrected RC and the presence of coma in its images. A close look at 
the off-axis images in Fig. 9.8 shows a lateral displacement away from the center 
of the field as the wavelength increases. This effect is a consequence of the plate 
thickness, in this case 10mm, and will not degrade image quality under normal 
seeing conditions. 

Because the plate in this example is in an f/10 beam, higher-order aberrations
are negligible. The curvature of the median image surface found from ray traces is
$\kappa_m = 4.69/R_1$, a value about 10% larger than the Petzval curvature calculated
from ray tracing.
Fig. 9.7. Through-focus spot diagrams at $\lambda = 550$ nm for system shown in Fig. 9.6. Scale bar on
the upper left is 2 arc-sec long. See the discussion following Eqs. (9.3.4).

Fig. 9.8. Spot diagrams at selected wavelengths for system shown in Fig. 9.6. Box width is
2 arc-sec. See the discussion following Eqs. (9.3.4).


9.3.b. MODIFIED RITCHEY-CHRETIEN TELESCOPE 


The effectiveness of an aspheric plate in the Cassegrain beam suggests that 
still larger fields are possible if the telescope plus plate are designed as a system. 
In this case the conic constants of the primary and secondary are also adjustable 
parameters, and all of the aberrations can be made zero. Examples of such 
designs are the 1.0- and 2.5-m telescopes described by Bowen and Vaughan 
(1973) and located at Las Campanas Observatory in Chile. The design of the 1.0-m 
telescope has the additional feature of a flat Petzval field, thus the need for 
bending a photographic plate or arranging arrays of CCD detectors to match a 
curved median image surface is avoided. Well-corrected fields over 2° in diameter 
are achieved with these designs. 

The first step in the procedure for designing a flat-field Ritchey-Chretien to
cover a wide field is to specify zero Petzval curvature for the telescope. Thus
$\rho = 1$, as is evident from Eq. (5.7.17). This condition, in turn, requires that

$$m^2 - 1 = m(1 + \beta). \qquad (9.3.5)$$


Substitution of Eq. (9.3.5) into Eqs. (6.2.3) and (6.2.4) gives the conic constants
of the mirrors, and all of the telescope parameters are now specified. If, for
example, we choose $\beta = 0.2$, then m = 1.7662, $K_1 = -1.4912$, $K_2 = -26.905$,
and k = 0.4338. From these results we see that the mirrors, especially the
secondary, are strongly hyperbolic and that the obscuration of the secondary is
larger than that of the typical RC telescope.
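The flat-field parameters can be reproduced with the following Python sketch. It assumes the zero-Petzval condition in the form m² − 1 = m(1 + β), the relation k = (1 + β)/(m + 1), and the standard Ritchey-Chretien conic constants, which are taken here to be what Eqs. (6.2.3) and (6.2.4) express; the function name is hypothetical.

```python
import math


def flat_field_rc(beta):
    """Return (m, k, K1, K2) for a Ritchey-Chretien with zero Petzval curvature."""
    # Positive root of m**2 - (1 + beta)*m - 1 = 0, i.e. m**2 - 1 = m*(1 + beta).
    m = 0.5 * ((1.0 + beta) + math.sqrt((1.0 + beta) ** 2 + 4.0))
    # Ratio of ray heights at the mirrors (assumed relation k = (1 + beta)/(m + 1)).
    k = (1.0 + beta) / (m + 1.0)
    # Standard RC conic constants (assumed to match Eqs. 6.2.3 and 6.2.4).
    K1 = -1.0 - 2.0 * (1.0 + beta) / (m**2 * (m - beta))
    K2 = -((m + 1.0) / (m - 1.0)) ** 2 - 2.0 * m * (m + 1.0) / ((m - beta) * (m - 1.0) ** 3)
    return m, k, K1, K2


m, k, K1, K2 = flat_field_rc(0.2)
print(f"m = {m:.4f}, k = {k:.4f}, K1 = {K1:.4f}, K2 = {K2:.3f}")
```

For β = 0.2 the sketch gives m close to 1.766 together with the k, K1, and K2 values of this example, which is a useful consistency check before the parameters are handed to an optimization program.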

The next step in the design is simply one of substituting m, $\beta$, and a choice of
g into Eq. (9.3.3) and finding a first-order solution for b. The telescope parameters
and the values of b and g are then the starting point for computer optimization of
the complete system, telescope plus corrector plate.


9.4. CASSEGRAIN FOCAL REDUCERS 


A focal reducer is an optical system whose function is to change the focal ratio 
of a telescope. It is most often used at the Cassegrain focus to reduce the focal 
ratio so that a given field can be placed on a detector of smaller area. In this 
section we discuss the general characteristics of focal reducers used at the 
Cassegrain focus. With the exception of a Schmidt camera example for a 
Ritchey-Chretien telescope, we omit the details of specific designs. 


9.4.a. GENERAL CONFIGURATION


A schematic of a Cassegrain focal reducer is shown in Fig. 9.9. Its components 
include a field lens at the Cassegrain focus to image the exit pupil of the telescope 
onto the aperture stop of the focal reducer, a collimator to render the light parallel, 
and a camera. Other optical elements, such as a grating or filter, can be put in the 
space between the collimator and camera. Because such elements are located in a 
collimated beam, they introduce no additional aberrations. 

The diameters of the focal reducer components depend on the field to be
covered. If the angular radius of the field on the sky is $\theta$, then the diameter of the
field lens is $2f\theta$, where f is the telescope focal length. The diagram in Fig. 9.10
shows the chief ray from the center of the telescope exit pupil for an object at the
edge of the field. Assuming the aperture stop of the focal reducer is at the
collimator lens, the angle $\alpha$ at which this chief ray enters the collimator is given
by

$$\alpha f_c = \psi f_1 \delta = \theta f, \qquad (9.4.1)$$

where $f_c$ is the focal length of the collimator, $f_1$ is the focal length of the primary,
and Eq. (2.6.4) is substituted to eliminate $\psi$.

Fig. 9.9. Schematic of focal reducer where c and c' denote collimator and camera, respectively,
and FL is the field lens at the telescope focus.

Fig. 9.10. Schematic of focal reducer in relation to telescope exit pupil EP. See text, Section 9.4,
for definitions of symbols.

We also see from Fig. 9.9 that $D_c$, the diameter of the collimator lens, is $f_c/F$.
Therefore

$$\alpha/\theta = f/f_c = D/D_c. \qquad (9.4.2)$$


For a real lens pair the stop is often located in the space between the lenses, but
the distance d in Fig. 9.9 is usually small compared to $f_c$ and, to a good
approximation, the stop is effectively at the collimator. If d is small, the diameters
of the collimator and camera lens are nearly equal, and Eq. (9.4.2) can be used to
find the diameter of either.

For a given D and $\theta$, we see from Eq. (9.4.2) that a smaller $D_c$ implies a larger
$\alpha$. A larger value of $\alpha$, in turn, generally means that the design of the lenses in the
focal reducer is more difficult. We also see that the size of the focal reducer scales
directly with the size of the telescope for a given $\alpha$ and $\theta$.

The focal length of the telescope-focal reducer combination is f of the 
telescope times the magnification of the focal reducer. The reader can verify 
that the magnification of the focal reducer is the ratio of the camera to collimator 
focal lengths or focal ratios, hence f of the combination is the diameter D of the 
telescope times the focal ratio of the camera. 

As an example, let $\theta$ = 0.5°, $\alpha$ = 20°, and D = 1 m. Substituting these values
into Eq. (9.4.2) gives $D_c$ = 25 mm, and a well-designed commercial camera lens
is appropriate for the camera lens of the focal reducer. If F = 10 for the telescope,
then $f_c$ = 250 mm is the focal length of the collimator, and the constraints on its
design are relatively modest compared to those on the camera lens. With the same
telescope magnification and angles, but D = 4 m, the dimensions of the lenses
are 4× larger and their design and mounting are a more difficult problem.
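The sizing in this example follows directly from Eq. (9.4.2) together with the relation that the collimator diameter is its focal length divided by the telescope focal ratio. A minimal Python sketch, with hypothetical function and argument names:

```python
def focal_reducer_sizes(D_mm, F_telescope, theta_deg, alpha_deg):
    """Collimator diameter and focal length from alpha/theta = D/D_c and D_c = f_c/F."""
    # Only the ratio theta/alpha enters, so both angles may stay in degrees.
    D_c = D_mm * theta_deg / alpha_deg
    f_c = F_telescope * D_c
    return D_c, f_c


for D_mm in (1000.0, 4000.0):
    D_c, f_c = focal_reducer_sizes(D_mm, F_telescope=10.0, theta_deg=0.5, alpha_deg=20.0)
    print(f"D = {D_mm / 1000:.0f} m: D_c = {D_c:.0f} mm, f_c = {f_c:.0f} mm")
```

The linear scaling with D is explicit in the first line of the function: with the angles held fixed, every lens dimension grows in proportion to the telescope aperture.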


9.4.b. TYPES OF FOCAL REDUCERS


A variety of focal reducer types have been analyzed by a number of 
investigators, with a good summary of these given by Wilson (1971). His 
paper includes examples of multilens systems that convert an f/8 telescope 
beam to f/3 with good image quality of a field 0.9° in diameter. The main 
difficulty with lens systems, as noted by Wilson, is the chromatic aberration over 
an extended wavelength range. 

Catadioptric systems have the advantage of smaller chromatic aberration but 
the disadvantage of obstruction due either to the detector or one of the mirrors. 
Wilson describes briefly some catadioptric systems that have been proposed for 
focal reducers, such as the standard Schmidt, Bouwers-Maksutov, and Schmidt- 
Cassegrain cameras, where each is used with a field lens to reimage the telescope 
exit pupil. The text by Wilson (1996) should be consulted for further details and 
references. 

Meinel, Meinel and Wang (1985) have described a four-mirror focal reducer 
for the Nasmyth focus of a large telescope with good imagery over a field radius 
of 8 arc-min. Their paper should be consulted for details. 


9.4.c. EXAMPLE: SCHMIDT CAMERA 


To illustrate the approach to the design of a focal reducer, we consider a
Schmidt camera modified for the required conditions. The basic configuration
adopted, field lens plus camera, is shown in Fig. 9.11. The field lens images the
telescope exit pupil on the aspheric plate with the chief ray shown entering the
camera at angle $\psi$, where $\psi/\theta = f/d$ for a telescope of focal length f with field
angle $\theta$. We assume a Ritchey-Chretien telescope and adjust the camera parameters
to eliminate the astigmatism present at the RC focal surface.

The RC telescope is free of coma and spherical aberration, while the aspheric
plate has no coma and astigmatism when it is at a pupil. Thus the coma of the
system is that of the mirror only, with the coma coefficient given in Table 5.6.
Setting this coefficient to zero gives

$$\frac{W}{R} = \left[1 - K\left(\frac{m-1}{m+1}\right)\right]^{-1}, \qquad (9.4.3)$$

where K is the conic constant of the mirror. As we will show shortly, K must be
chosen nonzero to eliminate the telescope astigmatism.

Fig. 9.11. Schematic of Schmidt focal reducer. See text, Section 9.4, for discussion.

The system spherical aberration is that of the aspheric plate and mirror, while 
the astigmatism is that of the telescope and focal reducer mirror. Taking the 
appropriate coefficients for the corrector and mirror from Tables 5.5 and 5.6, and 
the telescope astigmatism from Table 6.6, we get 


2 2 2 2 
By. = BRE) + 2a) 9 K+ ( -7) | (9.4.4) 


_ bf 4 n m+1\* 
B3, = B) tæl (2) À (9.4.5) 


where the subscripts 1, 3, and 4 refer to the telescope primary, aspheric plate, and 
camera mirror, respectively, and n = 1 for the mirror. In writing these system 
coefficients we assume the field lens and aspheric plate thickness and radius do 
not contribute to the aberrations. 

The terms in Eqs. (9.4.4) and (9.4.5) are simplified by noting that $y_3/y_4 = d/s$,
$y_4/y_1 = s/f$, and $\psi = \theta(f/d)$, where s is the distance from the field lens to the
camera mirror.

The procedure is now one of substituting Eq. (9.4.3) into Eq. (9.4.4), setting
Eq. (9.4.4) to zero, and solving for K in terms of $B_{1s}$(RC). Letting
$B_{1s}(\mathrm{RC}) = -\theta^2\Gamma/2f$, where $\Gamma$ is the quantity in brackets in AAS in Table 6.9,
the result is
TR (m+1\" rr\! 
K =— | — 1—-—] . 9.4.6 
7 (mai) (37) 046 


The value of K from Eq. (9.4.6) is substituted into Eq. (9.4.3) to find W/R, which,
in turn, is used to find the ratio d/s. In terms of the camera parameters we find

$$\frac{d}{s} = 1 - \frac{W}{s} = 1 - \frac{W}{R}\left(\frac{2m}{m-1}\right). \qquad (9.4.7)$$

Values derived from Eqs. (9.4.6) and (9.4.7) are substituted into Eq. (9.4.5),
which is solved for b after setting $B_{3s}$ to zero.

All of the relations needed to specify the Schmidt focal reducer (SFR) are now 
in hand. For the telescope we take the design parameters of the 1.5-m f/8 
telescope shown in Table 9.4. For the SFR we assume a final focal ratio of 2.67, 
hence m = —1/3 for the camera, with s = —2000 mm and R = —1000 mm. 

With these parameters we find $\Gamma = 2.67$, K = -0.025, W = 1.053R,
d/s = 0.4737, and b = -8.937E-9 for an SiO$_2$ corrector at $\lambda = 500$ nm. A listing
of all the SFR parameters, including the field lens, is shown in Table 9.5.
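Several of these numbers can be cross-checked with a short Python sketch. It assumes that the bracketed quantity in AAS for the RC telescope is [m(2m + 1) + β]/[2m(1 + β)], that the zero-coma condition of Eq. (9.4.3) reads W/R = [1 − K(m − 1)/(m + 1)]⁻¹, and that Eq. (9.4.7) reduces to d/s = 1 − W/s. The value of K is taken as given rather than computed from Eq. (9.4.6), and all names are illustrative.

```python
def gamma_rc(m, beta):
    """Bracketed astigmatism factor for an RC telescope (assumed form)."""
    return (m * (2.0 * m + 1.0) + beta) / (2.0 * m * (1.0 + beta))


def schmidt_reducer_geometry(K, m_camera, R_mm, s_mm):
    """Return (W/R, d/s) for the camera mirror, using the assumed Eqs. 9.4.3 and 9.4.7."""
    W_over_R = 1.0 / (1.0 - K * (m_camera - 1.0) / (m_camera + 1.0))
    d_over_s = 1.0 - W_over_R * R_mm / s_mm
    return W_over_R, d_over_s


# Telescope of Table 9.4 and camera values quoted in the text.
gamma = gamma_rc(m=2.667, beta=0.2)
W_over_R, d_over_s = schmidt_reducer_geometry(K=-0.025, m_camera=-1.0 / 3.0,
                                              R_mm=-1000.0, s_mm=-2000.0)
print(f"Gamma = {gamma:.2f}, W/R = {W_over_R:.3f}, d/s = {d_over_s:.4f}")
```

Because R/s = 2m/(m − 1) for a mirror, the second expression is the same as writing d/s in terms of m alone; the explicit R and s are kept here only to mirror the listed camera parameters.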

Ray traces of the system with the nominal SFR parameters given in Table 9.5
show an image diameter of about 1 arc-sec at a field angle of 0.5° and wavelength
of 500 nm. With a 2-mm change in the corrector location and a 10% increase in
the focal length of the field lens, the image diameter is reduced to approximately
0.25 arc-sec at the same field angle and wavelength. Over the range from 320 to
1000 nm, the image diameter is 0.5 arc-sec or smaller at the edge of the field,
hence the broadband image quality is satisfactory. The image surface is curved
with a radius of curvature of -950 mm.

Although this type of camera would appear to be an obvious choice for a focal 
reducer, it has several disadvantages. One problem is its curved focal surface, but 
a field flattener lens can be added to remove this curvature. A much more serious 
problem is the location of the focal surface inside the camera. It is not possible to 
locate large detector systems such as cooled solid-state arrays at an internal focus 
without vignetting most of the light before it reaches the mirror. One way of 
getting an external focus is a folded Schmidt camera with a tilted plane mirror 
between the corrector and sphere, as shown in Fig. 15.7, and the detector behind a 
hole in the plane mirror. The size of the hole and the position of the detector 
determine the amount of vignetting, and for a large field this is significant. 
Schmidt-Cassegrain cameras of the type described in Chapter 8, modified to 
reimage an object surface at a finite distance, also have an external focus, but 
vignetting by the camera’s secondary mirror can be a shortcoming of this type of 
focal reducer. An example of a Schmidt-Cassegrain focal reducer with field- 
flattening optics is described by Opal and Booth (1990). 


Table 9.4

Parameters of 1.5-m Ritchey-Chretien Telescope

Overall:    m = 2.667, k = 0.3273, $\beta$ = 0.2
            f = 12.0 m, F = 8
Primary:    $R_1$ = -9000 mm, $K_1$ = -1.1368
Secondary:  $R_2$ = -4712.7 mm, $K_2$ = -6.5524
Table 9.5

Parameters of Schmidt Focal Reducer

Overall:     m = -0.333, s = -2000 mm
Mirror:      R = -1000 mm, K = -0.025
             W = 1.053R (nominal)
               = 1.055R (optimized)
Corrector:   b = -8.937E-9, E = 2.416E-9
             radius = -35 320 mm, thickness = 10 mm
Field Lens:  plano-convex, thickness = 18 mm
             radius = -380 mm (nominal)
                    = -420 mm (optimized)
Lens and corrector material: SiO$_2$


9.5. ATMOSPHERIC DISPERSION CORRECTORS 


We now turn our attention to a different type of corrector, one that compen- 
sates for the dispersion of the Earth’s atmosphere. This effect, discussed in 
Section 3.6.a under the name of differential atmospheric refraction, is a 
consequence of the wavelength-dependent index of refraction of the atmosphere. 
A curve of the differential refraction at 45° zenith angle over the wavelength 
range 340-1000 nm is shown in Fig. 9.12. This curve is based on a relation from 
Allen for the atmospheric conditions given in Table 3.1. The scale on the vertical 
axis in Fig. 9.12 is set to zero at $\lambda = 435$ nm, a choice approximately centered in
the range shown. For other zenith angles the scale is simply expanded or 
contracted according to Eq. (3.6.3). 

The device that compensates for this effect is called an atmospheric dispersion 
corrector or ADC. In the absence of an ADC the image of a star with a ground- 
based telescope is a short, vertical spectrum, especially noticeable if the angle 
between the telescope axis and zenith is large. With an ideal ADC this same 
image shows no dispersion at any zenith angle, as well as no large displacement 
from a nominal position on the detector. Thus there are two basic requirements 
for an ADC: (1) variable dispersion to compensate that of the atmosphere at a 
given zenith angle; and (2) zero-deviation at some mean wavelength, denoted by
$\lambda_0$, within the range of interest for all zenith angles.

The first of these requirements suggests counterrotating prisms with dispersion
a maximum (minimum) when the apex angles of the prisms are in the same
(opposite) directions. Two prisms are sufficient to satisfy this requirement, but
they cannot satisfy the zero-deviation condition unless each prism by itself is a
zero-deviation unit, a pair of prisms with different dispersions and oppositely
directed apex angles. Thus an ADC is a set of four prisms paired to satisfy the
given requirements. Because the required dispersion is small, each prism in an
ADC can be considered "thin" and the paraxial approximation is adequate for the
analysis. Schematic diagrams of an ADC, with angles exaggerated for clarity and
the doublet pairs separated, are shown in Fig. 9.13. The dispersion of the ADC is
a maximum in Fig. 9.13a and zero in Fig. 9.13b.

Fig. 9.12. Differential atmospheric refraction at a 45° zenith angle. See Table 3.1 for the
atmospheric conditions and Section 3.6 for discussion.


Fig. 9.13. Schematic diagrams of atmospheric dispersion correctors: (a) maximum dispersion, (b)
zero dispersion. Angles are exaggerated for clarity.
For a single thin prism of index n the deviation $\delta = (n - 1)\gamma$, as shown in Fig.
4.12. The condition for zero deviation for a pair of prisms is

$$(n_1 - 1)\gamma_1 = (n_2 - 1)\gamma_2, \qquad (9.5.1)$$

where both apex angles are taken positive. The prisms in each doublet in Fig.
9.13 are cemented, both for mechanical stability and ease of handling and for
maximum light transmission.

From Eq. (9.5.1) we see that the choice of two glasses with the same indices of
refraction at $\lambda_0$ gives equal apex angles and plane-parallel opposite faces. In this
case light of one wavelength is neither deviated nor displaced laterally, hence the 
pointing of the telescope is unaffected by inclusion of the ADC in the light beam. 
In practice it is difficult to find suitable glasses to satisfy this condition. The usual 
approach, and the one followed here, is to take glasses with adequate transmission 
in the near ultraviolet and a large difference in dispersive powers, and to accept a 
small amount of lateral displacement at the detector. 


9.5.a. EXAMPLE: ADC IN COLLIMATED LIGHT


As an example we consider the glasses UBK7 and LLF6 from the Schott 
catalog. We choose the wavelength of zero deviation $\lambda_0 = 435$ nm and find the
indices n(UBK7) = 1.52675 and n(LLF6) = 1.54559. Given these indices we 
use Eq. (9.5.1) to find the ratio of the prism angles. The selection of the individual 
prism angles depends on the telescope and configuration in which the ADC is 
used; for this example we choose 1.506° and 1.454° for the UBK7 and LLF6 
prisms, respectively. These angles are the same as those used by Wynne (1984) in 
the design of an ADC for the 4.2-m William Herschel telescope. 
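The prism angles quoted here can be checked against the zero-deviation condition, taken in the form (n₁ − 1)γ₁ = (n₂ − 1)γ₂ with both apex angles positive. The following Python sketch uses the indices and the UBK7 angle from the text; the function name is hypothetical.

```python
def zero_deviation_angle(n1, gamma1, n2):
    """Apex angle of the second prism from (n1 - 1)*gamma1 = (n2 - 1)*gamma2."""
    return (n1 - 1.0) * gamma1 / (n2 - 1.0)


n_ubk7, n_llf6 = 1.52675, 1.54559      # indices at 435 nm, as quoted in the text
gamma_ubk7 = 1.506                     # degrees
gamma_llf6 = zero_deviation_angle(n_ubk7, gamma_ubk7, n_llf6)
wedge = gamma_ubk7 - gamma_llf6        # residual wedge of each doublet, degrees
print(f"LLF6 angle = {gamma_llf6:.3f} deg, residual wedge = {wedge:.3f} deg")
```

The residual wedge is the difference of the two apex angles, so each doublet is not quite plane-parallel; this is the small wedge that shifts the image laterally at the zero-deviation wavelength.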

Our chosen layout of each of the doublet prisms in Fig. 9.13a,b is UBK7 on 
the left, facing the incident light, with the interface between the prism pairs 
perpendicular to the axis of a telescope. To ensure maximum light transmission, 
the doublet prisms are placed in contact, with an oil film between them to allow 
rotation of one relative to the other. Each individual prism is given a central 
thickness of 10 mm and collimated light incident on the ADC is focused by a
perfect thin lens with f = 1000 mm following the ADC. 

Results of ray traces for selected wavelengths are shown in Fig. 9.14 for the
ADC configuration in Fig. 9.13a. Note that the wavelengths are spaced at
constant intervals of 70 nm, with decreasing dispersion as the wavelength
increases, similar to that of the atmosphere as shown in Fig. 9.12. Not shown
in Fig. 9.14 is the lateral displacement for $\lambda_0$ of about 0.5 mm, a consequence of
the wedge angle of 0.052° for each half of the ADC. If the configuration of the
ADC is changed to that of Fig. 9.13b, then all wavelengths are superposed at $\lambda_0$ in
Fig. 9.14 and the lateral displacement is nearly zero.

Fig. 9.14. Ray traces of wavelengths at 70 nm intervals of collimated light through ADC shown in
Fig. 9.13(a). Glasses are UBK7 and LLF6. See the text for prism parameters and discussion.


The example here is simply to illustrate how an ADC is configured and to give 
dispersion results for typical glasses and angles. In practice, an ADC is designed 
to match a particular telescope such that maximum dispersion is obtained at the 
maximum desired zenith angle. At smaller zenith angles the two separate doublet 
prisms are rotated in opposite directions to reduce the net dispersion but maintain 
a vertical direction for this dispersion. The net dispersion is simply the vector sum 
of the separate dispersions of the individual prism pairs. As an example, if each 
pair is rotated 45° from the configuration for maximum dispersion, the net 
dispersion is reduced by a factor of $\sqrt{2}$.
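The vector sum can be written out explicitly. In the Python sketch below each doublet contributes a dispersion of equal magnitude, rotated by +φ and −φ from the direction of maximum dispersion; the components along that direction add and the perpendicular components cancel. The function name and the unit dispersion are illustrative assumptions.

```python
import math


def net_dispersion(d_pair, phi_deg):
    """Magnitude of the summed dispersion for two doublets rotated by +phi and -phi."""
    phi = math.radians(phi_deg)
    # Vertical components add, horizontal components cancel by symmetry.
    vertical = d_pair * math.cos(phi) + d_pair * math.cos(-phi)
    horizontal = d_pair * math.sin(phi) + d_pair * math.sin(-phi)
    return math.hypot(vertical, horizontal)


maximum = net_dispersion(1.0, 0.0)
for phi in (0.0, 45.0, 90.0):
    print(f"phi = {phi:4.1f} deg: net/max = {net_dispersion(1.0, phi) / maximum:.4f}")
```

The ratio is simply cos φ, so the net dispersion can be tuned continuously from its maximum to zero while its direction stays fixed.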


9.5.b. EXAMPLE: ADC IN CONVERGENT LIGHT 


The preceding example with collimated light illustrates the principles of an 
ADC without the complications of aberrations. Most often, however, an ADC is 
located in a convergent beam as, for example, in a Cassegrain telescope. In this 
case aberrations are introduced by the prisms and a more detailed analysis of the 
telescope plus ADC is required to ensure that image quality over the desired field 
is not seriously degraded. An example of such a detailed analysis is given by 
Wynne and Worswick (1986) for the f/11 Cassegrain beam of the William
Herschel telescope; the interested reader should consult their paper. 

Here we are interested in the aberrations introduced by the ADC without the 
added complication of telescope aberrations. We take the same UBK7-LLF6 
combination as in the previous example and locate the incident face of this ADC 
in a converging f/4 beam 500 mm from focus. Ray traces with the orientation of
the ADC prism pairs at maximum dispersion give the spot patterns shown in Fig. 
9.15 for three field angles and four wavelengths. Figure 9.16 shows spot patterns 
for the configuration of minimum dispersion. 

There are several features of interest in Figs. 9.15 and 9.16, and we consider 
them in turn. First, and most obvious, is the variation of image size with 
wavelength, hence a chromatic focal error. Second, asymmetry due to coma is 
evident at off-axis field positions for both configurations. The final feature worth 
noting is that on-axis images are symmetric in the case of minimum dispersion 
but slightly comatic in the configuration for maximum dispersion. The source of 
these asymmetries is discussed in the following section. 

The chromatic focal error is most clearly seen in through-focus spot diagrams,
as shown in Fig. 9.17 at zero field angle for the ADC set for maximum
dispersion. As shown by Wynne and Worswick, this error is eliminated by
putting a slight curvature on the interface between the zero-deviation units. For
this example, putting a radius of curvature of 5 m on each of the oiled contact
surfaces gives the through-focus patterns shown in Fig. 9.18. The first prism pair
of the ADC is a weak converging lens and the second is a weak diverging lens.
Because of the power added to each half of the ADC, a refocus of about 1 mm is
also required.

Fig. 9.15. Spot diagrams for selected wavelengths and field angles of f/4 beam through ADC
with maximum dispersion. See the text for prism parameters and discussion.

Fig. 9.16. Spot diagrams for selected wavelengths and field angles of f/4 beam through ADC
with minimum dispersion. See the text for prism parameters and discussion.

It is also worth comparing the linear dispersions at the detector for these two 
examples. The angular dispersion of the ADC is constant, but the linear 
dispersion is directly proportional to the effective lever arm, 1000 mm in the
first example and 500 mm in the second. 


9.5.c. ABERRATIONS OF PRISMS IN CONVERGING BEAM 


We now consider the aberrations introduced when prisms are placed in a
converging beam and examine, in turn, a single prism, a doublet prism, and a
complete ADC at different dispersion orientations. For either a single prism or
prism combinations, the effects due to thickness and angles can be separated as
any prism is essentially a plane-parallel plate plus a thin wedge. The effect due to
the thickness is considered first.

Fig. 9.17. Chromatic focal error for ADC in f/4 beam set for maximum dispersion. All of the
prism faces are plane. Wavelengths from top to bottom are 400, 435, 500, and 650 nm.

Fig. 9.18. Through-focus spot diagrams for ADC in f/4 beam. The surfaces between the front
and back pairs of the ADC have a radius of curvature of 5 m. Wavelengths from top to bottom are 400,
435, 500, and 650 nm.

The aberration coefficients for a plane-parallel plate of thickness t and index n 
are given in Eqs. (7.2.10)–(7.2.12), with $\theta_1$ the angle of the chief ray at the first
surface. Substituting these relations into Eq. (5.4.1) gives the results in Table 9.6, 
transverse aberrations for a plate in a converging beam. It is evident from the 
results in Table 9.6 that the aberrations are more significant for thicker plates (and 
multiple prisms as in an ADC) and faster beams. 

The aberrations of a single thin wedge are derived in Section 15.6 and we take 
the important results from that discussion. Note that the coefficients for a wedge 
given in Eqs. (15.6.5) and (15.6.6) are simply the sum of the surface coefficients 
because the wedge is thin and the beam size is the same at both surfaces. Note 
also that these coefficients are expressed in terms of $\epsilon$ and $\gamma$, where the angle of
incidence at the wedge of angle $\gamma$ is $\theta_1 = \epsilon\gamma$.

We substitute the coefficients in Eqs. (15.6.6) and (15.6.5) into Eq. (5.4.1) to 
get transverse coma and astigmatism, with these relations given in Table 9.7. Note 
that the relation for coma in Table 9.7 is independent of $\epsilon$, hence it does not
depend on the orientation of the wedge. Astigmatism, on the other hand, does 
depend on the wedge orientation. 

The procedure used to find the aberration coefficients for a single prism is 
easily extended to that of a cemented double prism. As with a single prism, the 
pertinent coefficients are those of coma and astigmatism from Table 5.1, now 
written for each of the three surfaces. Because the wedge pair is thin, the sum of 
the surface coefficients is the coefficient for the wedge. We outline this procedure 
for coma. 


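Table 9.6

Transverse Aberrations of Plane-Parallel Plate in Converging Beam

$$\mathrm{TSA} = 4A_3 y^3 s' = \frac{(n^2-1)}{16\,n^3F^3}\left(\frac{t}{s}\right)s'$$

$$\mathrm{TTC} = 3A_2 y^2 s' = -\theta_1\,\frac{3(n^2-1)}{8\,n^3F^2}\left(\frac{t}{s}\right)s'$$

$$\mathrm{TAS} = 2A_1 y s' = \theta_1^2\,\frac{(n^2-1)}{2\,n^3F}\left(\frac{t}{s}\right)s'$$

Table Symbols: t = thickness of plate of index n; $s \cong s'$; F = s/2y = focal ratio of
converging beam.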
Table 9.7

Transverse Aberrations of Single Thin Prism in Converging Beam

$$\mathrm{TTC} = 3A_2 y^2 s' = -\gamma\,\frac{3(n^2-1)}{8\,nF^2}\,s'$$

$$\mathrm{TAS} = 2A_1 y s' = \gamma^2\,\frac{(n^2-1)}{2F}\left(\frac{2\epsilon}{n} - 1\right)s'$$

Table Symbols: $\gamma$ = angle of wedge; angle of incidence = $\epsilon\gamma$; F = s/2y = focal ratio of
converging beam; s' = distance from prism to focus.


Let the indices of refraction and wedge angles for the first and second prisms
be denoted by $n_1$, $\gamma_1$, and $n_2$, $\gamma_2$, respectively, and let the surfaces in order be
numbered 1 through 3. The coma coefficient for the prism pair in air is

$$A_2(\mathrm{pair}) = \frac{\theta_1}{2s_1^2}\left(\frac{1-n_1^2}{n_1^2}\right) + \frac{n_1\theta_2}{2s_2^2}\left(\frac{n_1^2-n_2^2}{n_2^2}\right) + \frac{n_2\theta_3}{2s_3^2}\left(n_2^2-1\right), \qquad (9.5.2)$$

where $s_2 = n_1 s_1$ and $s_3 = n_2 s_1$. Although Eq. (9.5.2) can be evaluated for an
arbitrary angle of incidence at the first surface and an arbitrary prism pair, we
consider only the special case where $\theta_1 = \gamma_1 - \gamma_2$, $\theta_2 = -(n_2/n_1)\gamma_2$, $\theta_3 = 0$, and
Eq. (9.5.1) applies. The prism described by these restrictions is the first half of the
ADC shown in Fig. 9.13a,b. Evaluating Eq. (9.5.2) with these conditions gives
the coma coefficient for a zero-deviation doublet prism as


$$A_{2z} = -\frac{\gamma_1 (n_2 - n_1)(n_1 - 1)}{2\,n_1 n_2 s_1^2}, \qquad (9.5.3)$$

where the subscript z denotes zero-deviation. Following the same procedure for
the astigmatism gives

$$A_{1z} = \frac{\gamma_1^2 (n_2 - n_1)(n_1 - 1)}{n_1 (n_2 - 1) s_1}. \qquad (9.5.4)$$





Evaluation of Eq. (9.5.2) for an arbitrary angle of incidence gives Eq. (9.5.3) for 
any angle within the range over which the paraxial approximation is valid. A 
similar analysis for astigmatism gives a relation in which the dependence on the 
orientation of the prism pair is a factor. It turns out on further analysis that this 
dependence is only a minor factor for the range of angles likely to be encountered 
for an ADC in a typical telescope, hence Eq. (9.5.4) is acceptable as a measure of 
the astigmatism coefficient for all small angles of incidence. 
Following the usual procedure of substituting the aberration coefficients into
Eq. (5.4.1) we get the transverse aberrations for a zero-deviation doublet prism in
Table 9.8. It is worth noting again that the contributions of coma and astigmatism
given in Table 9.8 for a single doublet prism are constant over the field and do not
include the prism thickness.

We are now in a position to determine coma and astigmatism for a full ADC. 
Consider first the coma due to the prism effect. When the dispersions add, as in 
the configuration in Fig. 9.13a, the comas of the pairs add and the resultant is 
twice that given in Table 9.8. When the configuration is that shown in Fig. 9.13b 
the comas of the pairs have opposite signs and the coma due to the wedge effect is 
zero. At other dispersions the coma is reduced from its maximum value by an 
amount equal to the fractional reduction in the dispersion. 

For astigmatism due to the prism effect the aberration is a maximum when the 
configurations are as shown in Fig. 9.13a,b, and is zero when the prism pairs are 
at right angles with respect to one another. 


9.5.d. DISCUSSION OF ABERRATION RESULTS 


For actual thin prisms, such as those used in the preceding examples, the net 
aberrations are simply the sum due to both thickness and wedge effect. For an 
ADC configured as in Fig. 9.13a, the net transverse coma is the sum of TTC from 
Table 9.6 and 2 TTC from Table 9.8. A similar sum gives the net astigmatism for 
an ADC in this configuration. 

We now calculate the various transverse aberrations at $\lambda = 435$ nm and $\theta = 1°$
for the ADC giving the spot patterns shown in Fig. 9.15. All of the following
results are given in microns. For a total central thickness of 40 mm we find
TSA = 14.6, TTC = -6.0, TAS = 0.6; for the wedge effect we get TTC = -2.6
and TAS = 2.0. The only difference for the spot patterns in Fig. 9.16 is that
TTC = 0 for the wedge effect.

Table 9.8

Transverse Aberrations of Zero-Deviation Doublet Prism* in Converging Beam

$$\mathrm{TTC} = 3A_{2z} y^2 s' = -\gamma_1\,\frac{3(n_2 - n_1)(n_1 - 1)}{8\,n_1 n_2 F^2}\,s'$$

$$\mathrm{TAS} = 2A_{1z} y s' = \gamma_1^2\,\frac{(n_2 - n_1)(n_1 - 1)}{n_1 (n_2 - 1) F}\,s'$$

* Zero-deviation condition: $(n_1 - 1)\gamma_1 = (n_2 - 1)\gamma_2$.

Table Symbols: F = s/2y = focal ratio of converging beam; s' = distance from prism to
focus. Angular aberration subtended on sky = TA/f, f = telescope focal length.

From these numbers we see that coma due to thickness is more significant than 
that due to the wedge effect. We can use the signs and relative sizes of the coma 
numbers to account for the differences between positive and negative $\theta$ seen in
Figs. 9.15 and 9.16. We also see that spherical aberration is a factor in this f/4 
beam, but that astigmatism is relatively unimportant. 
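The sizes of these contributions can be estimated with a short Python sketch. It rests on several assumptions that are not stated in this form in the text: the plate terms are taken as TSA = (n² − 1)t/(16n³F³), TTC = −3(n² − 1)tθ/(8n³F²), and TAS = (n² − 1)tθ²/(2n³F) with s ≈ s'; the doublet wedge terms are taken as TTC = −3γ₁(n₂ − n₁)(n₁ − 1)s'/(8n₁n₂F²) and TAS = γ₁²(n₂ − n₁)(n₁ − 1)s'/[n₁(n₂ − 1)F]; a mean index is used for the 40-mm glass path; and s' = 500 mm. All function names are illustrative.

```python
import math


def plate_aberrations_um(t_um, n, F, theta_rad):
    """Transverse aberrations (microns) of a plane-parallel plate, assuming s ~ s'."""
    c = (n**2 - 1.0) / n**3
    tsa = c * t_um / (16.0 * F**3)
    ttc = -3.0 * c * t_um * theta_rad / (8.0 * F**2)
    tas = c * t_um * theta_rad**2 / (2.0 * F)
    return tsa, ttc, tas


def doublet_wedge_aberrations_um(gamma1_rad, n1, n2, F, s_prime_um):
    """Transverse coma and astigmatism (microns) of one zero-deviation doublet."""
    common = (n2 - n1) * (n1 - 1.0)
    ttc = -3.0 * gamma1_rad * common * s_prime_um / (8.0 * n1 * n2 * F**2)
    tas = gamma1_rad**2 * common * s_prime_um / (n1 * (n2 - 1.0) * F)
    return ttc, tas


n1, n2 = 1.52675, 1.54559                 # UBK7 and LLF6 at 435 nm
n_mean = 0.5 * (n1 + n2)                  # assumption: mean index for the 40-mm path
tsa, ttc, tas = plate_aberrations_um(40.0e3, n_mean, F=4.0, theta_rad=math.radians(1.0))
ttc_w, tas_w = doublet_wedge_aberrations_um(math.radians(1.506), n1, n2,
                                            F=4.0, s_prime_um=500.0e3)
print(f"thickness: TSA = {tsa:.1f}, TTC = {ttc:.1f}, TAS = {tas:.1f}")
print(f"wedge (two doublets): TTC = {2 * ttc_w:.1f}, TAS = {2 * tas_w:.1f}")
```

Under these assumptions the sketch lands within a few percent of the values quoted in the text; exact agreement is not expected because the distance from each doublet to focus and the index of each element are treated only approximately.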

The sizes of the computed aberrations suggest that the example ADC could be 
used in a beam faster than f/4 without undue image degradation. This is not the 
case, however, because the elimination of the chromatic focal error by putting a 
radius of curvature at the oiled contact introduces a significant amount of 
spherical aberration, as is evident from the images at best focus in Fig. 9.18. 
Because of this curvature factor, the use of an ADC with plane outer faces in a 
converging beam is limited to beams no faster than f/5 or f/6, with detailed 
analysis based on ray traces required over the entire range of wavelengths to 
ensure adequate image quality. 


9.5.e. EXAMPLE: ADC IN RITCHEY-CHRETIEN TELESCOPE 


The previous examples illustrate the principles of an ADC in a collimated 
beam and a converging beam, but do not relate the characteristics of an ADC, 
angles and locations in the beams, to the correction of atmospheric dispersion. 
Doing the latter requires selecting a telescope and maximum zenith angle at
which atmospheric refraction is corrected. We choose a 3.6-m RC telescope with
the parameters given in Table 6.10, but with $F_1$ = 1.5 and F = 6, and a maximum
zenith angle of 68.2° (for which the tangent is 2.5). We again use UBK7 and LLF6 glasses.

With the needed parameters in hand, we proceed to match an ADC to the 
chosen telescope. From Fig. 9.12 we get a differential atmospheric refraction of 
1.40 arc-sec between 400 and 650 nm when the zenith angle is 45°. Scaling this to
the maximum zenith angle by using Eq. (3.6.3) gives a maximum difference in
refraction of 3.5 arc-sec between 400 and 650 nm. The telescope in this example
has a focal length f = 21.6 m and a scale of 105 μm/arc-sec. Thus the ADC in
the Cassegrain beam must give a maximum separation at the focal surface of
about 366 μm between these wavelengths to compensate for the atmospheric
dispersion.

Our previous example, located 500 mm from focus, gives a separation of
150 μm between 400 and 650 nm. Increasing this by the required factor of 2.44
means either: (a) an increase in the prism angles by this factor with no change in
distance from focus, (b) an increase in distance from focus by this factor with no
change in angles, or (c) some combination of the two. Choosing between these
options requires analysis of image quality over the desired field, an exercise not
done here.
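The arithmetic of the last two paragraphs can be retraced in a few lines. The following is only a sketch under stated assumptions: the focal length is taken as the product of the 3.6-m aperture and F = 6, the differential refraction is assumed to scale as the tangent of the zenith angle (as the quoted numbers imply), and all function and variable names are ours, not part of the text.

```python
import math

# Arc-seconds in one radian (about 206265).
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def plate_scale_um_per_arcsec(aperture_m, focal_ratio):
    """Image scale at focus in micrometers per arc-second."""
    focal_length_um = aperture_m * focal_ratio * 1.0e6
    return focal_length_um / ARCSEC_PER_RADIAN


def scale_refraction_by_tangent(value_at_45_deg, tan_zenith):
    """Scale a value quoted at 45 deg (tan = 1) to another zenith angle.

    Assumes the differential refraction is proportional to tan(zenith angle).
    """
    return value_at_45_deg * tan_zenith


scale = plate_scale_um_per_arcsec(3.6, 6.0)                 # um per arc-sec
max_refraction = scale_refraction_by_tangent(1.40, 2.5)     # arc-sec
required_separation_um = max_refraction * scale             # um at focus
growth_factor = required_separation_um / 150.0              # relative to 500-mm example
option_b_distance_mm = 500.0 * growth_factor                # distance for option (b)

print(f"scale               = {scale:.1f} um/arc-sec")
print(f"required separation = {required_separation_um:.0f} um")
print(f"growth factor       = {growth_factor:.2f}")
print(f"option (b) distance = {option_b_distance_mm:.0f} mm")
```

The sketch reproduces the quoted scale, separation, and factor to the precision given in the text; option (b) then corresponds to a distance from focus of roughly 1220 mm.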

For this example we choose option (b) and place the first surface of our 
example ADC at 1220 mm from the nominal telescope focus. Because the ADC 
is farther from focus, hence larger in diameter, we give each prism a central 
thickness of 20 mm. From ray traces we find that a radius of curvature of 15 m at 
the interface between the doublet prisms eliminates the chromatic focal error. The 
resulting spot patterns at three field angles and four wavelengths are shown in Fig. 
9.19. Spot patterns for the telescope without an ADC are roughly comparable to 
those for λ = 500 nm. Although the images with an ADC are larger than those of
a bare telescope, the latter is restricted to small zenith angles or narrowband filters 
to achieve comparable image quality. 


9.5.f. CONCLUDING REMARKS 


For faster beams, such as at prime focus, spherical aberration and coma due to
the thickness of the prisms make the images unacceptably large. Wynne (1986)
has pointed out that it is possible to design an acceptable ADC if each equivalent
plane-parallel plate is replaced by a meniscus lens. Except for the wedge added to
make each lens a prism, the curved surfaces of these lenses have a common center
of curvature at the on-axis focus of the telescope. The interested reader should
consult the article by Wynne for further details. A design of an ADC in an f/1.6
beam in a wide-field 3-mirror telescope has been given by Willstrop (1987).

Fig. 9.19. Spot diagrams for selected wavelengths and field angles for a 3.6-m RC telescope with
an ADC in the f/6 Cassegrain beam. Box width is 1 arc-sec. See the text for discussion.

Finally, it is important to note that the differential dispersion produced by an 
ADC is not generally a mirror image of that due to the atmosphere. As a 
consequence the exact compensation at a pair of wavelengths may leave residual 
dispersions at other wavelengths. Wynne and Worswick show that the residual 
dispersion for the UBK7-LLF6 combination is approximately 0.4 arc-sec for a 
zenith angle α = 71.6°. This residual scales as the tangent of the zenith angle
and is 0.13 arc-sec at α = 45°.


9.6. FIBER OPTICS 


With the development of high-efficiency optical fibers to pipe light from one 
point in space to another, astronomers have found a tool that is revolutionizing 
spectroscopy. Individual fibers placed on stellar sources at a telescope focal 
surface have their opposite ends aligned along the slit of a spectrometer, and the 
result is multiple-object spectroscopy (MOS). By this means, the observing 
efficiency of a telescope is increased dramatically. In this section we outline 
the characteristics of optical fibers, especially those of importance to astronomers. 
Our discussion follows an excellent review article by Barden (1995). 

The type of fiber most often used is a multimode, stepped-index fiber. This 
kind of fiber consists of a high index of refraction glass core surrounded by a 
sheath of a lower index glass called a cladding. Core diameters used in MOS are 
generally in the range of 50 to 500 μm, and the thickness of the cladding is
typically about one-tenth that of the core. Light is guided through the core by
total internal reflection that takes place at the interface between the core and
cladding. A plastic outer coating called a buffer protects the glass fiber. A
common way of giving the size of a fiber is core/cladding/buffer in μm, such as
200/220/240. Characteristics of particular interest in MOS are spectral
transmittance, focal ratio degradation, and image scrambling. We now consider
each of these in turn.

Transmission as a function of wavelength depends in large part on the OH 
content of the fiber core. Fibers with high OH content, so-called wet fibers, have 
poor transmittance in the red and near-infrared but are good transmitters in the 
near-ultraviolet. Dry fibers with low OH content have excellent transmittance in
the 1-2 μm range, but are not suitable for wavelengths shorter than about 500 nm.
It has been found that hydrogen-doped dry fibers have good transmittance from
the infrared to wavelengths as short as 400 nm. Typical fiber lengths required
with large telescopes are in the 20-m range. The total transmittance over this
length, including a reflection loss of 4% at each end, is about 90% for a dry fiber
at 1 μm. Curves of transmittance as a function of wavelength can be found in the
cited article by Barden.
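To see how the quoted 90% splits between the end faces and the glass itself, the following sketch separates the two contributions. It assumes a purely exponential bulk attenuation and the 4% loss per face stated above; the attenuation in dB/km and the 50-m length are illustrative quantities of ours, not values from the text or from Barden.

```python
import math


def internal_transmittance(total, end_reflection=0.04):
    """Remove the reflection loss at the two fiber faces from a total transmittance."""
    return total / (1.0 - end_reflection) ** 2


def attenuation_db_per_km(internal, length_m):
    """Bulk attenuation implied by an internal transmittance over length_m."""
    return -10.0 * math.log10(internal) / (length_m / 1000.0)


def total_transmittance(db_per_km, length_m, end_reflection=0.04):
    """Total transmittance of a fiber, including both end reflections."""
    internal = 10.0 ** (-db_per_km * (length_m / 1000.0) / 10.0)
    return (1.0 - end_reflection) ** 2 * internal


t_internal = internal_transmittance(0.90)        # glass only, 20 m
loss = attenuation_db_per_km(t_internal, 20.0)   # implied dB/km
t_50m = total_transmittance(loss, 50.0)          # hypothetical longer run

print(f"internal transmittance over 20 m = {t_internal:.3f}")
print(f"implied attenuation              = {loss:.1f} dB/km")
print(f"total transmittance over 50 m    = {t_50m:.3f}")
```

Under these assumptions most of the 10% loss comes from the two end reflections (about 8%), so moderate changes in fiber length have only a small effect on the total.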

The transfer of the light from a star at the focus of a telescope via a fiber to a
spectrometer is most efficient if the output focal ratio from the fiber (F_out) is equal
to the input focal ratio at the telescope focus (F_in). It turns out that all fibers tend
to increase the cone angle of the output beam compared to the input beam, that is,
F_out < F_in, an effect called focal ratio degradation (FRD). This degradation
depends on several factors, including mechanical stresses induced by bending
that deforms the cylindrical shape of the fiber. An excellent review of FRD,
including a discussion of the mechanisms responsible for FRD, is given by
Ramsey (1988).

Extensive measurements of FRD on a variety of fibers show that faster input
focal ratios are more nearly preserved than slower ones. Figure 9.20 shows typical
results for F_out plotted as a function of F_in. Each curve indicates the fraction of
light collected by the collimator of a spectrometer for different input focal ratios.
It is clear from the results in Fig. 9.20 that the normal design of a spectrometer,
with the collimator focal ratio matched to that of the input beam, is not
appropriate for a fiber-fed spectrometer. The options are to
increase the diameter of the collimator to capture most of the light from the fiber
or live with substantially less dispersed light because of light loss at the
collimator. As pointed out by Ramsey, either represents a loss for spectroscopy. A
full discussion of the effect of FRD on resolution-throughput products is found in
Section 12.3.

Fig. 9.20. Representative curves for focal ratio degradation (FRD). F(in) is the beam focal ratio at
a fiber input; F(out) is the focal ratio of the emerging beam for different fractions collected at that focal
ratio. Fraction collected: Lowest curve, 1.0; Dot-dot curve, 0.95; Upper solid curve, 0.90.

The final characteristic we consider is that of image scrambling, the mixing of 
the input light both radially and azimuthally by a fiber to produce a uniform 
output beam. In effect, the output from a fiber has no memory of the distribution 
of light on the input end of the fiber. This is especially important for
spectrometers used to measure wavelengths to very high precision, as in the search for
planets and other faint stellar companions from measures of changing radial 
velocities. The interested reader should consult the excellent review by Heacox 
(1995). 

This section is a brief introduction to a large topic. The work in this field is 
well covered in conference proceedings edited by Barden (1988) and cited at the 
end of this chapter. 

Chapter 10 Diffraction Theory and Aberrations


The discussion of telescopes and their aberrations in previous chapters is 
entirely from the point of view of geometric optics. This approach is one in which 
the ray is a well-defined entity, with the wavelength in the geometric optics limit 
effectively zero. The paths of rays through an optical system are governed by 
Fermat’s principle and aberrations occur when rays do not pass through the 
paraxial image point. An aberration-free image in the geometric optics limit is, 
according to Fermat’s principle, a true point image. It was pointed out in Section 
3.7, however, that the wave nature of light sets the image size for an otherwise 
perfect or diffraction-limited optical system, with the analysis there intended only 
to give an estimate of the size of the diffraction image. 

In this chapter and the next the emphasis is on the character of the perfect 
image from the point of view of diffraction theory. Because no optical system is 
strictly perfect, we also consider the effect of the aberrations of a nearly perfect 
optical system on the diffraction image. Our analysis proceeds along two lines. In 
this chapter the starting point is Huygens’ principle and the superposition of 
waves from points on a wavefront. In the following chapter the analysis is in 
terms of transfer functions, with application to the imaging capability of the 
Hubble Space Telescope (HST) expected before its launch. 

As part of our discussion of the near-perfect image, we generalize our 
representation of its characteristics in terms of transverse aberrations and 
introduce orthogonal aberrations in terms of Zernike polynomials. With this 
representation we find that giving image quality in terms of root mean square 
(rms) wavefront error is especially informative. Our discussion includes a 
comparison between image size for the near-perfect image found by diffraction
theory and that from geometric aberration theory as given in Chapter 5. 

Before discussing the nature of a perfect image as formed by a telescope with a 
circular or annular aperture, we discuss Huygens’ principle and its extension by 
Fresnel. This principle is the basis for diffraction theory and the Fresnel- 
Kirchhoff diffraction integral. We first apply this theory to a rectangular aperture. 
The mathematics is a bit simpler in this case and the results, although of interest 
in their own right, are especially useful in situations where the rectangular 
aperture reduces to a narrow slit, as in many types of spectrometers. 

Our analysis is limited to the special case of Fraunhofer diffraction and 
parallels that given in most optics texts, such as those by Hecht (1987) or Born 
and Wolf (1980), with the notation basically that of the latter authors. 


10.1. HUYGENS-FRESNEL PRINCIPLE 


The initial statement of Huygens’ principle was made in an attempt to 
understand the laws of reflection, refraction, and the propagation of light. He 
started with the assumption that light was a wave and could be described in terms 
of wavefronts. From the point of view of Fermat’s principle, a wavefront is a 
surface on which every point has the same optical path distance from a point 
source of light. Viewed as a wave, a wavefront is a surface on which every point 
has the same phase. Huygens postulated that at a given time each point on a 
primary wavefront acts as a source of secondary spherical wavelets, and that the 
envelope of these wavelets at a slightly later time is the new primary wavefront. 
He further stated that these secondary wavelets propagate with a speed and 
frequency equal to that of the primary wave. 

This statement suffices to account for the laws of reflection and refraction, and 
the approximately straight-line propagation of light through large apertures, but it
fails to account for diffraction, the deviations from exact straight-line propagation
of light. Fresnel extended Huygens’ principle by assuming that the secondary 
wavelets interfere with one another according to the principle of superposition. 
His statement postulated that each unobstructed point on a wavefront is a source 
of spherical wavelets, and that the amplitude of the wave at any point ahead of the 
wavefront is the superposition of all of these wavelets. In adding these wavelets it 
is necessary to include the amplitude and phase of each wavelet. The Huygens- 
Fresnel principle was put on a firm theoretical basis by Kirchhoff and expressed 
as an integral derived from the wave equation. Details of the derivation of the 
Fresnel-Kirchhoff diffraction integral can be found in Born and Wolf (1980) or 
any intermediate optics text. 

For our purposes we are interested only in the special case of Fraunhofer
diffraction, that in which the source of light and the field point of interest are 
effectively at infinity relative to the aperture. In practice this is accomplished in a 
lab by collimating the light from a point source, passing that light through an 
aperture, and observing the diffraction pattern in the focal plane of a lens 
following the diffracting aperture. Thus the image of a star formed by a telescope 
or, equivalently, the converging spherical wave diffracted by the exit pupil of the 
telescope, is a Fraunhofer diffraction pattern. 


10.1.a. FRAUNHOFER DIFFRACTION: RECTANGULAR APERTURE


The layout for a rectangular aperture is shown in Fig. 10.1, with a spherical
wavefront W of radius of curvature R emerging from the aperture of sides 2a and
2b. An arbitrary point Q on the spherical wavefront has coordinates (ξ, η, ζ), with
the origin of this coordinate system at the center of the rectangle. An arbitrary
field point P has coordinates (x, y, z) with the origin O of this system at distance R
along the ζ axis. The wave amplitude U at point P is the sum of all amplitude
contributions from each area dS on the wavefront. For all cases of interest we
assume the dimensions of the rectangle and the distance of point P from the
origin of its coordinate system are small compared to R. When these conditions
are satisfied, the sum of the contributions from each dS is a simple scalar sum
given by

$$U(P) = C \int_W \exp[ik(s - R)]\,dS, \tag{10.1.1}$$

where k = 2π/λ, s is the distance from Q to P, C is a constant proportional to the
amplitude at Q, and the integration is over the unobstructed wavefront.

Fig. 10.1. Coordinate frames at exit pupil (ξ, η, ζ) of rectangular aperture and image surface
(x, y, z) of optical system. W is a spherical wavefront of radius R centered at O. See Eq. (10.1.1).

As noted in the foregoing, the center of curvature of the wavefront is at point 
O. For this particular point P we have s = R for all points Q on the wavefront, 
hence all waves are in phase at O. Therefore the argument in Eq. (10.1.1) is zero 
and the integral gives the area of the rectangle. Because all waves are in phase at 
O, the amplitude U(O) is a maximum. For any other P the path and phase 
differences are (s — R) and k(s — R), respectively, and the amplitude at P is less 
than that at O. 

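Expressing s in terms of the coordinates of Q and P, relative to the origin at O,
we get

$$\begin{aligned}
s^2 &= (x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2 \\
    &= R^2 - 2(x\xi + y\eta + z\zeta),
\end{aligned} \tag{10.1.2}$$

where all squared terms in x, y, and z are negligible. With s nearly equal to R,
given our assumption about distances in the preceding, we have s² − R² =
(s − R)(s + R) ≈ 2R(s − R). Substituting this relation into Eq. (10.1.2) gives

$$\begin{aligned}
s - R &= -\frac{x\xi + y\eta + z\zeta}{R} \\
      &= -\frac{x\xi + y\eta}{R} + z\left[1 - \frac{\xi^2 + \eta^2}{2R^2}\right],
\end{aligned} \tag{10.1.3}$$

where the second line follows from the binomial expansion of
ζ = −(R² − ξ² − η²)^{1/2} for a point Q on the wavefront.
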

At this point we define p = x/R, q = y/R, where p and q are the direction cosines
of a line from the center of the rectangle to P. Setting z = 0 to limit our analysis
to the paraxial focal plane, we substitute the remaining terms into Eq. (10.1.1)
and get

$$\begin{aligned}
U(P) &= C \int_{-b}^{b}\int_{-a}^{a} \exp[-ik(p\xi + q\eta)]\,d\xi\,d\eta \\
     &= C \int_{-a}^{a} \exp(-ikp\xi)\,d\xi \int_{-b}^{b} \exp(-ikq\eta)\,d\eta.
\end{aligned} \tag{10.1.4}$$

Evaluating the integrals we get

$$U(P) = CA\left(\frac{\sin kpa}{kpa}\right)\left(\frac{\sin kqb}{kqb}\right)
      = CA\left(\frac{\sin v_x}{v_x}\right)\left(\frac{\sin v_y}{v_y}\right), \tag{10.1.5}$$

where A = 4ab is the area of the rectangular aperture, and v_x = kpa and v_y = kqb are
dimensionless variables.

The intensity at point P of an image is the absolute square of the time-averaged
amplitude of the electromagnetic wave at P, while the point spread
function (PSF) at P is the intensity normalized to unity at the point where the
intensity is a maximum. Denoting the intensity at the center of the image by I_0,
and the PSF at point P by i(P), we get

$$i(P) = \frac{I(P)}{I_0} = \left(\frac{\sin v_x}{v_x}\right)^2\left(\frac{\sin v_y}{v_y}\right)^2, \tag{10.1.6}$$

where I_0 = C²A².

A two-dimensional (2D) surface plot of i(P) for a square aperture is shown in
Fig. 10.2, with i(P) > 0.1 removed to enhance the secondary peaks. A semilog
plot of the function X = (sin v_x / v_x)² is shown in Fig. 10.3. Note that the function
X is zero when v_x = nπ, where n is any nonzero integer. Hence the plot in Fig.
10.3 of X versus v_x/π shows the minima at integer values.

Of particular interest is the minimum with n = 1 adjacent to the principal
maximum. At this minimum v_x = π = (2π/λ) p_1 a, hence

$$p_1 = x_1/R = \lambda/2a. \tag{10.1.7}$$

From Eq. (10.1.7) we see that the linear distance between minima on opposite
sides of the main peak is 2x_1. We also find that the full-width-half-maximum
(FWHM) of the principal peak is approximately 0.9x_1. The corresponding angular
distances are 2p_1 and 0.9p_1, where the latter is the angle subtended by the FWHM
at the aperture.
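The factor of approximately 0.9 can be checked numerically by locating the point where (sin v / v)² falls to one-half. The sketch below uses a simple bisection; the function names and the 100-μm slit illuminated at 500 nm are illustrative choices of ours, not values from the text.

```python
import math


def slit_psf(v):
    """One-dimensional factor (sin v / v)^2 of Eq. (10.1.6)."""
    if v == 0.0:
        return 1.0
    return (math.sin(v) / v) ** 2


def bisect_root(func, lo, hi, tol=1.0e-12):
    """Root of func in [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Half-maximum point of the principal peak, in the dimensionless variable v.
v_half = bisect_root(lambda v: slit_psf(v) - 0.5, 1.0, 2.0)

# The first minimum is at v = pi (x = x_1), so the FWHM in units of x_1 is:
fwhm_over_x1 = 2.0 * v_half / math.pi

# Illustrative numbers only: aperture width 2a = 100 um, wavelength 0.5 um.
wavelength_um = 0.5
aperture_width_um = 100.0
p1 = wavelength_um / aperture_width_um      # Eq. (10.1.7), radians
angular_fwhm = fwhm_over_x1 * p1            # radians

print(f"v at half maximum = {v_half:.4f}")
print(f"FWHM / x_1        = {fwhm_over_x1:.3f}")
print(f"angular FWHM      = {angular_fwhm:.5f} rad")
```

The half-maximum point lies near v = 1.39, which gives a FWHM of about 0.886x_1, consistent with the rounded value of 0.9x_1.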

The importance of Eq. (10.1.7) in angular terms cannot be overemphasized.
For a point source, the angular FWHM in a given direction is approximately the
wavelength divided by the width of the aperture in that direction. The larger the
width, the smaller the angular size. The only difference between apertures of
different shape is the numerical factor of order unity that multiplies
wavelength/width.

Fig. 10.2. Surface plot of i(P) < 0.1 for a square aperture.

Fig. 10.3. Slice of i(P) for a square aperture along the x-axis.

The decrease in the PSF in the wings of the diffraction pattern beyond a few
bright fringes is best described in terms of a locally smoothed PSF. Across a
fringe the average value of sin² v is 0.5, hence along the x-axis

$$\langle i(P)\rangle_x = \frac{1}{2v_x^2} = \frac{1}{2\pi^2}\left(\frac{\lambda}{2a}\right)^2\frac{1}{p^2}, \tag{10.1.8}$$

with a similar relation along the y-axis. In the case of a square aperture of side 2a,
the average PSF along the diagonal of the diffraction pattern x = ±y, where
v_x = v_y and p = q, is

$$\langle i(P)\rangle_{xy} = \frac{1}{4v_x^4} = \frac{1}{4\pi^4}\left(\frac{\lambda}{2a}\right)^4\frac{1}{p^4}. \tag{10.1.9}$$

For a circular or annular aperture the decrease in the PSF is proportional to 1/α³,
where α is the field angle. It is an interesting exercise to compare the average PSF
in the wings of an image for a circular aperture, given in Eq. (10.2.12), with that
of a square aperture, and consider the possible advantages of using the latter in
the search for faint stellar companions.
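One way to start on this exercise is to evaluate the smoothed PSFs at a common field angle. The sketch below is ours and rests on several assumptions: the square has side 2a equal to the circular diameter D, so that the same variable v = πDα/λ applies to both; the circular case uses our reading of Eq. (10.2.12), ⟨i(P)⟩ = 4(1 + ε)/[π(1 − ε²)²v³]; and the field angle of 20 λ/D is an arbitrary illustrative choice.

```python
import math


def square_wing_psf(v_x, v_y):
    """Locally smoothed PSF of a square aperture.

    sin^2 is replaced by 1/2 along each axis with a nonzero offset;
    a zero offset along an axis contributes a factor of one.
    """
    factor_x = 1.0 if v_x == 0.0 else 1.0 / (2.0 * v_x ** 2)
    factor_y = 1.0 if v_y == 0.0 else 1.0 / (2.0 * v_y ** 2)
    return factor_x * factor_y


def circular_wing_psf(v, eps=0.0):
    """Locally smoothed PSF of an annular aperture, written in terms of v."""
    return 4.0 * (1.0 + eps) / (math.pi * (1.0 - eps ** 2) ** 2 * v ** 3)


field_angle_in_lambda_over_d = 20.0          # illustrative choice
v = math.pi * field_angle_in_lambda_over_d

on_axis = square_wing_psf(v, 0.0)                                     # along x
on_diagonal = square_wing_psf(v / math.sqrt(2.0), v / math.sqrt(2.0))  # along x = y
circular = circular_wing_psf(v)

print(f"square, along x-axis   : {on_axis:.3e}")
print(f"square, along diagonal : {on_diagonal:.3e}")
print(f"circular, clear        : {circular:.3e}")
print(f"circular / diagonal    : {circular / on_diagonal:.1f}")
```

Under these assumptions the square aperture is brighter than the circular one along its axes but considerably fainter along the diagonals, with the ratio growing in proportion to the field angle.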

The enclosed energy EE is defined as the fraction of the total energy E within
an area of sides (2x, 2y) centered on the PSF, where E is proportional to the
integral of i(P) over the entire receiving plane. In terms of dimensionless
variables we have

$$EE = \int_{-v_x}^{v_x}\int_{-v_y}^{v_y} i(P)\,dv_x\,dv_y \bigg/ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} i(P)\,dv_x\,dv_y. \tag{10.1.10}$$

For a square aperture of side 2a we take a square area of side 2x at the PSF. To
find EE for the principal maximum we have x = x_1, v_x = v_y = π. Substituting
for i(P) from Eq. (10.1.6) and evaluating Eq. (10.1.10) at these limits, we find
EE = 0.815 for the principal maximum. Within a square enclosing the FWHM we
get EE = 0.52.
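Both enclosed-energy values can be recovered by direct numerical integration, because i(P) for a square aperture separates into identical x and y factors. The sketch below uses the composite Simpson rule and the standard result that the integral of (sin v / v)² over the whole axis is π; the function names are ours.

```python
import math


def slit_psf(v):
    """One-dimensional factor (sin v / v)^2 of Eq. (10.1.6)."""
    if v == 0.0:
        return 1.0
    return (math.sin(v) / v) ** 2


def simpson(func, lo, hi, intervals=2000):
    """Composite Simpson rule; intervals must be even."""
    h = (hi - lo) / intervals
    total = func(lo) + func(hi)
    for k in range(1, intervals):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0


def enclosed_energy_square(v_limit):
    """EE inside |v_x| <= v_limit and |v_y| <= v_limit for a square aperture."""
    one_axis = 2.0 * simpson(slit_psf, 0.0, v_limit) / math.pi
    return one_axis ** 2


# Half-maximum point of (sin v / v)^2 by bisection (the function falls on [1, 2]).
lo, hi = 1.0, 2.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if slit_psf(mid) > 0.5:
        lo = mid
    else:
        hi = mid
v_half = 0.5 * (lo + hi)

ee_main = enclosed_energy_square(math.pi)   # square out to the first minima
ee_fwhm = enclosed_energy_square(v_half)    # square enclosing the FWHM

print(f"EE within principal maximum = {ee_main:.3f}")
print(f"EE within FWHM square       = {ee_fwhm:.3f}")
```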

The final item of interest for a rectangular aperture is the case where a ≪ b,
hence a narrow slit in the y direction. In the limiting case where the slit is 
effectively infinite in length, it is appropriate to consider the distant source as a 
line rather than a point and the wavefront W in Fig. 10.1 as cylindrical. As a result 
there is no dependence on y and the resulting diffraction pattern is the x-part only 
of Eq. (10.1.6). The pattern consists of a series of bright and dark fringes parallel 
to the source and slit. 


10.2. PERFECT IMAGE: CIRCULAR APERTURE 


We now apply the Fresnel-Kirchhoff diffraction integral to the case of a 
circular aperture with a central obscuration, as is the usual case for a telescope. 
Our analysis is done for an annular aperture with an obscuration of diameter εD,
from which results for a clear aperture follow with ε set to zero.

As shown in the previous section, the characteristics of the image of a point
source object formed by a perfect optical system are completely described by the
point spread function (PSF) and quantities derivable from it. One of the
quantities derived from the PSF is the encircled energy fraction (EE); for a 
circular or annular aperture this is the fraction of the total energy in the image 
within a circle of a given radius centered on the PSF. The intensity in units of flux 
per unit area at a point on the image is directly proportional to the PSF, while the 
average intensity over a centered portion of an image depends on both the PSF 
and EE. Derivation of each of these items in this section parallels that for the 
rectangular aperture. 


10.2.a. POINT SPREAD FUNCTION 


Consider the exit pupil of an optical system with radius a, as shown in Fig.
10.4, with a spherical wavefront W of radius of curvature R emerging from the
pupil. As in Fig. 10.1, an arbitrary point Q on the wavefront in Fig. 10.4 has
coordinates (ξ, η, ζ) and is a distance ρa from the z-axis at angle φ with the ξ
axis. For an annular aperture with central obscuration of radius εa, ρ is in the
range ε to 1. An arbitrary field point P is a distance r from the z-axis at angle ψ
with the x axis. Relative to the origin at O, the coordinates for points Q and P are

$$\begin{aligned}
\xi &= a\rho\cos\varphi, & x &= r\cos\psi, \\
\eta &= a\rho\sin\varphi, & y &= r\sin\psi, \\
\zeta &= -\sqrt{R^2 - a^2\rho^2}.
\end{aligned} \tag{10.2.1}$$

For all systems of interest we assume the distances z, r, and a are small compared
to R. When these conditions are satisfied, the sum of the contributions from each
dS on the wavefront is given by Eq. (10.1.1), where the distance s from Q to P is
given by Eq. (10.1.2).

Substituting Eqs. (10.2.1) into Eq. (10.1.2) gives

$$s - R = -\frac{a\rho r}{R}\cos(\varphi - \psi) + z\left[1 - \frac{1}{2}\left(\frac{a\rho}{R}\right)^2\right], \tag{10.2.2}$$

where ζ in Eqs. (10.2.1) is transformed by the binomial expansion. Following
Born and Wolf, we define dimensionless variables u and v in the form

$$u = \frac{2\pi}{\lambda}\left(\frac{a}{R}\right)^2 z, \qquad v = \frac{2\pi}{\lambda}\left(\frac{a}{R}\right) r. \tag{10.2.3}$$

Substituting Eqs. (10.2.3) into Eq. (10.2.2) gives

$$k(s - R) = -v\rho\cos(\varphi - \psi) - \frac{u\rho^2}{2} + \left(\frac{R}{a}\right)^2 u. \tag{10.2.4}$$

The introduction of these dimensionless variables is made for convenience in
relations involving aberrations to follow in subsequent sections.

Fig. 10.4. Coordinate frames at exit pupil (ξ, η, ζ) of circular aperture and image surface (x, y, z)
of optical system. W is a spherical wavefront of radius R centered at O. See Eq. (10.2.1).

At this point we set u = 0 and evaluate U(P) in the paraxial focal plane.
Noting that the area element dS = a²ρ dρ dφ, the amplitude in the paraxial focal
plane is given by substituting Eq. (10.2.4) with u = 0 into Eq. (10.1.1). The result
is

$$U(P) = Ca^2 \int_0^{2\pi}\int_\varepsilon^1 \exp[-iv\rho\cos(\varphi - \psi)]\,\rho\,d\rho\,d\varphi. \tag{10.2.5}$$

To carry out the integration over φ we note that U(P) is independent of ψ because
the system is symmetric about the z axis. Therefore we can choose any convenient
value for ψ; we choose ψ = π. Carrying out the integration over φ involves
substituting the integral representation of J_0, the Bessel function of order zero,
with the result

$$U(P) = 2\pi a^2 C \int_\varepsilon^1 J_0(v\rho)\,\rho\,d\rho = \frac{2\pi a^2 C}{v^2}\int_\varepsilon^1 d[v\rho J_1(v\rho)]. \tag{10.2.6}$$

The second step in Eq. (10.2.6) follows after substituting one of the recurrence
relations for Bessel functions found in tables of mathematical functions (see, e.g.,
“Tables of Integrals and Other Mathematical Data” by Dwight (1961)).
Integration of Eq. (10.2.6) gives

$$U(P) = \pi a^2 C\left[\frac{2J_1(v)}{v} - \varepsilon^2\frac{2J_1(\varepsilon v)}{\varepsilon v}\right], \tag{10.2.7}$$

where J_1 is a Bessel function of order one. The ratio 2J_1(w)/w approaches one as
w approaches zero, hence U(O) = πa²C(1 − ε²).

Using this result, we write the PSF at point P as

$$i(P) = \frac{I(P)}{I_0} = \frac{1}{(1 - \varepsilon^2)^2}\left[\frac{2J_1(v)}{v} - \varepsilon^2\frac{2J_1(\varepsilon v)}{\varepsilon v}\right]^2, \tag{10.2.8}$$

where I(P) = |U(P)|² and i(P) = PSF. It is convenient to represent the intensity
in this form to facilitate comparison of intensity profiles for apertures with
different central obscurations. As examples, semilog plots of PSFs are shown in
Fig. 10.5 for ε = 0, a clear aperture, and ε = 0.33, the obscuration of the Hubble
Space Telescope. Note that the abscissa in Fig. 10.5 is v/π. A 2D surface plot of
i(P) for a clear aperture is shown in Fig. 10.6, with i(P) > 0.1 removed to
enhance the rings around the main peak.

The PSF given by Eq. (10.2.8) and shown in Figs. 10.5 and 10.6 is in the
paraxial focal plane and characterizes the so-called Airy pattern. The intensity is a
maximum at the paraxial image point at v = 0, and the pattern is a central bright
disk, the Airy disk, surrounded by concentric bright and dark rings. For a clear
aperture the peak intensity of successive bright rings decreases monotonically as
v increases; for an obstructed aperture the intensity of successive bright rings
decreases in a cyclic manner, depending on the specific value of ε.

Fig. 10.5. Point spread function of perfect image for obscuration ratios ε = 0 (solid line) and
ε = 0.33 (dashed line).

Fig. 10.6. Surface plot of i(P) < 0.1 for a circular aperture with ε = 0.

The radii of the dark rings in the Airy pattern are found by setting i(P) = 0.
Radii for the first three dark rings are given in Table 10.1 for several values of ε.
Note that the radius of the first dark ring, which encloses the Airy disk, decreases
as ε increases, while the radius of the second dark ring is a maximum near ε = 0.3
and decreases for larger obscurations.

One important descriptor of the Airy pattern is the radius of the first dark ring.
Substituting v_1 = πw_1 from Table 10.1 with ε = 0 into Eqs. (10.2.3) gives

$$r_1 = 1.22\lambda F, \qquad \alpha_1 = \frac{r_1}{f} = 1.22\frac{\lambda}{D}, \tag{10.2.9}$$

where r_1 and α_1 are the linear and angular radii, respectively, of the Airy disk, f is
the system focal length, D is the diameter of the entrance pupil, and F is the focal
ratio. Substituting f/D for R/2a assumes the point source object is effectively at
infinity, and replaces exit pupil distances with those of the entrance pupil. This
substitution also defines the angular radius of the first dark ring as an angle
projected on the sky. For other values of ε, the factor 1.22 in Eq. (10.2.9) is
replaced by the corresponding numerical factor in the w_1 column in Table 10.1.

For a distant point source, the variable v is related to the system parameters and
a dimensionless radius w by the relations

$$v = \pi w = \frac{\pi r}{\lambda F} = \frac{\pi D\alpha}{\lambda}, \tag{10.2.10}$$

where r and α are linear and angular radii, respectively. The radii w for the first
and second dark rings, and the radius at which i(P) = 0.5 are shown in Fig. 10.7
for a range of ε.


Table 10.1

Radii of Dark Rings in Airy Pattern (a, b)

  ε       w_1      w_2      w_3

 0.00    1.220    2.233    3.238
 0.10    1.205    2.269    3.182
 0.20    1.167    2.357    3.087
 0.33    1.098    2.424    3.137
 0.40    1.058    2.388    3.300
 0.50    1.000    2.286    3.491
 0.60    0.947    2.170    3.389

(a) Subscript on w is the number of the dark ring starting at the innermost ring.
(b) w = v/π.
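Entries of this kind can be checked by locating the zeros of the amplitude in Eq. (10.2.7), which vanishes where J_1(v) − εJ_1(εv) = 0. The sketch below is ours: it evaluates the Bessel function from its standard integral representation with a trapezoid rule, scans for sign changes, and refines each root by bisection. The 0.5-μm wavelength and F = 6 in the last lines are illustrative values only.

```python
import math


def bessel_j(order, x, panels=400):
    """J_n(x) from (1/pi) * integral_0^pi cos(n*t - x*sin(t)) dt, trapezoid rule."""
    h = math.pi / panels
    total = 0.5 * (1.0 + math.cos(order * math.pi))   # end points t = 0 and t = pi
    for k in range(1, panels):
        t = k * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def amplitude(v, eps):
    """J_1(v) - eps * J_1(eps * v), proportional to v * U(P) in Eq. (10.2.7)."""
    return bessel_j(1, v) - eps * bessel_j(1, eps * v)


def dark_rings(eps, count=3, step=0.05):
    """Dimensionless radii w = v / pi of the first `count` dark rings."""
    radii = []
    v_lo, f_lo = step, amplitude(step, eps)
    while len(radii) < count:
        v_hi = v_lo + step
        f_hi = amplitude(v_hi, eps)
        if f_lo * f_hi < 0.0:                 # sign change brackets a root
            a, b, f_a = v_lo, v_hi, f_lo
            for _ in range(50):               # bisection refinement
                mid = 0.5 * (a + b)
                f_mid = amplitude(mid, eps)
                if f_a * f_mid <= 0.0:
                    b = mid
                else:
                    a, f_a = mid, f_mid
            radii.append(0.5 * (a + b) / math.pi)
        v_lo, f_lo = v_hi, f_hi
    return radii


rings_clear = dark_rings(0.0)
rings_033 = dark_rings(0.33)
rings_050 = dark_rings(0.50)

# Linear Airy radius r_1 = w_1 * lambda * F for illustrative values.
airy_radius_um = rings_clear[0] * 0.5 * 6.0

for eps, rings in ((0.0, rings_clear), (0.33, rings_033), (0.50, rings_050)):
    print(eps, [round(w, 3) for w in rings])
print(f"Airy radius at 0.5 um, F = 6: {airy_radius_um:.2f} um")
```

The scan step of 0.05 is adequate here because adjacent zeros are separated by roughly π in v; a finer step would be a sensible precaution for obscurations close to one.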

Fig. 10.7. Dimensionless radii at first two dark rings and one-half of peak intensity as a function
of obscuration ratio. Linear radius = wλF; angular radius = wλ/D.


10.2.b. AVERAGE PSF IN AIRY WINGS 


The description of the PSF in the wings of the Airy pattern beyond a few
bright rings is derived by using the following asymptotic relation for the function
J_n:

$$J_n(\beta) \approx \left(\frac{2}{\pi\beta}\right)^{1/2}\cos\left(\beta - \frac{n\pi}{2} - \frac{\pi}{4}\right). \tag{10.2.11}$$

This approximation is good to 1% or better for β > 15. Choosing n = 1 and
β = v or εv, as appropriate, Eq. (10.2.8) becomes

$$i(P) = \frac{1}{(1 - \varepsilon^2)^2}\left(\frac{8}{\pi v^3}\right)\left[\cos\left(v - \frac{3\pi}{4}\right) - \sqrt{\varepsilon}\,\cos\left(\varepsilon v - \frac{3\pi}{4}\right)\right]^2.$$

Squaring and expanding the factor in brackets, we find a locally smoothed PSF by
setting each of the cos² terms to one-half and the cross term to zero, with the
result

$$\langle i(P)\rangle = \frac{4(1 + \varepsilon)}{\pi(1 - \varepsilon^2)^2}\frac{1}{v^3} = \frac{4(1 + \varepsilon)}{\pi^4(1 - \varepsilon^2)^2}\left(\frac{\lambda}{D}\right)^3\frac{1}{\alpha^3}, \tag{10.2.12}$$

where α is the field angle in radians. The quantity ⟨i(P)⟩ is a good measure of the
average intensity over one or two Airy rings in the range where the asymptotic
relation for J_1 is a good approximation. For ε = 0.33, for example, Eq. (10.2.12)
is valid beyond the tenth bright ring.

From Eq. (10.2.12) we see that the average intensity in the wings of the Airy 
pattern is larger for larger values of ε. It is apparent that the effect of the central
obscuration is to transfer some of the energy from the disk and nearest bright 
rings into the wings. A quantitative measure for the fraction of the energy in the 
wings of the Airy pattern is developed in the following section. 
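As a rough check on the smoothing argument, the exact PSF of Eq. (10.2.8) can be averaged over a range of rings and compared with the smoothed result, which in terms of v we read as ⟨i(P)⟩ = 4(1 + ε)/[π(1 − ε²)²v³]. The sketch below is ours and makes several choices that are not in the text: the Bessel function is computed from its integral representation, the average is taken of v³ i(P) so that the steep 1/v³ trend does not weight the window, and the window starts at v = 50 and spans six periods of the slowest cross term, 2π/(1 − ε).

```python
import math


def bessel_j(order, x, panels=300):
    """J_n(x) from (1/pi) * integral_0^pi cos(n*t - x*sin(t)) dt, trapezoid rule."""
    h = math.pi / panels
    total = 0.5 * (1.0 + math.cos(order * math.pi))   # end points t = 0 and t = pi
    for k in range(1, panels):
        t = k * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def airy_psf(v, eps):
    """Exact PSF of Eq. (10.2.8) for v > 0."""
    bracket = 2.0 * (bessel_j(1, v) - eps * bessel_j(1, eps * v)) / v
    return (bracket / (1.0 - eps ** 2)) ** 2


def wing_coefficient(eps):
    """Coefficient of 1/v^3 in the locally smoothed PSF."""
    return 4.0 * (1.0 + eps) / (math.pi * (1.0 - eps ** 2) ** 2)


def window_average(func, lo, hi, intervals=1200):
    """Mean of func over [lo, hi] by the composite Simpson rule."""
    h = (hi - lo) / intervals
    total = func(lo) + func(hi)
    for k in range(1, intervals):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0 / (hi - lo)


def smoothing_ratio(eps, v_start=50.0, periods=6, reference_eps=0.33):
    """Average of v^3 * i(P) over the window, divided by the smoothed coefficient."""
    width = periods * 2.0 * math.pi / (1.0 - reference_eps)
    mean = window_average(lambda v: v ** 3 * airy_psf(v, eps), v_start, v_start + width)
    return mean / wing_coefficient(eps)


ratio_clear = smoothing_ratio(0.0)
ratio_obscured = smoothing_ratio(0.33)

print(f"exact / smoothed, eps = 0    : {ratio_clear:.3f}")
print(f"exact / smoothed, eps = 0.33 : {ratio_obscured:.3f}")
```

Ratios close to one indicate that the smoothed expression is a fair description of the ring-averaged wings in this range; the agreement depends on the width of the averaging window, particularly for the obstructed aperture where the cross term varies slowly.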


10.2.c. ENCIRCLED ENERGY 


The encircled energy EE is defined as the fraction of the total energy E in the
image enclosed within a circle of radius r centered on the PSF peak. Following
Born and Wolf we have

$$EE = \frac{1}{E}\int_0^{2\pi}\int_0^{r_0} I(P)\,r\,dr\,d\psi \tag{10.2.13}$$

$$\phantom{EE} = \frac{1 - \varepsilon^2}{2I_0}\int_0^{v_0} I(P)\,v\,dv, \tag{10.2.14}$$

where v_0 is a dimensionless radius, and I(P) is given by Eq. (10.2.8). The
transformation of Eq. (10.2.13) into Eq. (10.2.14) follows from substitution of v
for r using Eq. (10.2.3), and substitution of I_0 for E according to

$$I_0 = C^2A^2 = \frac{EA}{\lambda^2 f^2} = \frac{\pi E(1 - \varepsilon^2)}{4\lambda^2F^2}, \tag{10.2.15}$$

where C is the constant in Eq. (10.1.1), A = πa²(1 − ε²) is the area of the annular
aperture, f is the focal length, and F is the focal ratio. Further discussion of Eq.
(10.2.15) follows in the next section.

We first evaluate Eq. (10.2.14) for the clear aperture, and find

$$\begin{aligned}
EE(v_0) &= \frac{1}{2}\int_0^{v_0}\left[\frac{2J_1(v)}{v}\right]^2 v\,dv \\
        &= -\int_0^{v_0} d\left[J_0^2(v) + J_1^2(v)\right] \\
        &= 1 - J_0^2(v_0) - J_1^2(v_0).
\end{aligned} \tag{10.2.16}$$

The intermediate step in Eq. (10.2.16) follows after substitution of a recurrence
relation. When v_0 is taken at a dark ring, J_1(v_0) is zero and the fraction of the
energy outside the dark ring is given by J_0²(v_0).
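For the clear aperture the closed form 1 − J_0²(v_0) − J_1²(v_0) is easy to evaluate at the dark rings. The sketch below is ours; it computes the Bessel functions from their integral representation and uses the standard tabulated zeros of J_1 (3.8317, 7.0156, 10.1735) as the dark-ring radii.

```python
import math


def bessel_j(order, x, panels=400):
    """J_n(x) from (1/pi) * integral_0^pi cos(n*t - x*sin(t)) dt, trapezoid rule."""
    h = math.pi / panels
    total = 0.5 * (1.0 + math.cos(order * math.pi))   # end points t = 0 and t = pi
    for k in range(1, panels):
        t = k * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def encircled_energy_clear(v0):
    """Encircled energy within dimensionless radius v0 for a clear aperture."""
    return 1.0 - bessel_j(0, v0) ** 2 - bessel_j(1, v0) ** 2


# First three zeros of J_1, i.e. the dark rings of the clear-aperture Airy pattern.
J1_ZEROS = (3.8317, 7.0156, 10.1735)

ee_at_rings = [encircled_energy_clear(v) for v in J1_ZEROS]
outside = [bessel_j(0, v) ** 2 for v in J1_ZEROS]   # energy fraction outside each ring

for ring, (ee, oe) in enumerate(zip(ee_at_rings, outside), start=1):
    print(f"dark ring {ring}: EE = {ee:.3f}, outside = {oe:.3f}")
```

The three values come out near 0.838, 0.910, and 0.938, so about 16% of the energy of a perfect clear-aperture image lies outside the Airy disk.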

Following the same procedure for the obstructed aperture, Eq. (10.2.14) 
becomes 


EBCo) = ga 1 = Ieo) = Ilea) + 2C ~ Eleno) = Hero) 


—2e l Ji(ev) 2O a). (10.2.17) 
0 v 


Results derived from these relations for EE are shown in Fig. 10.8 for $\varepsilon = 0$ and 
$\varepsilon = 0.33$. Encircled energy values within each of the first three dark rings are 
given in Table 10.2, with data for the first two dark rings plotted in Fig. 10.9. Also 
shown in Fig. 10.9 are EEs within the radii at which the intensity is one-half of 
the peak. 

Examination of the results in Table 10.2 and Fig. 10.9 shows that there is a 
significant transfer of energy from the Airy disk to the first bright ring with 
increasing $\varepsilon$. We also see that EE in the disk and first bright ring combined 
decreases very slowly as $\varepsilon$ increases from zero to 0.35. From Fig. 10.9 it is also 
evident that noticeable energy transfer to the second bright ring begins when $\varepsilon$ is 
approximately equal to 0.4. 
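
The encircled energy within the Airy disk of an obstructed aperture can also be estimated by integrating the PSF directly, as in Eq. (10.2.14). The sketch below is an added illustration under stated assumptions: the PSF is taken in the standard annular form $i(v) = [2J_1(v)/v - \varepsilon^2\,2J_1(\varepsilon v)/\varepsilon v]^2/(1-\varepsilon^2)^2$, which is what Eq. (10.2.8) is assumed to be. The helper names, the bisection bracket (chosen for $0 \le \varepsilon \le 0.6$), and the Simpson step count are hypothetical choices.

```python
import math


def bessel_j1(x, steps=200):
    """J_1(x) from Bessel's integral, midpoint rule (helper for this sketch)."""
    h = math.pi / steps
    return sum(math.cos((i + 0.5) * h - x * math.sin((i + 0.5) * h))
               for i in range(steps)) / steps


def airy_amplitude(v, eps):
    """Unnormalized amplitude 2*J1(v)/v - eps^2 * 2*J1(eps*v)/(eps*v)."""
    if v == 0.0:
        return 1.0 - eps ** 2
    return 2.0 * (bessel_j1(v) - eps * bessel_j1(eps * v)) / v


def psf(v, eps):
    """Normalized PSF i(v) of an annular aperture with obscuration ratio eps."""
    return (airy_amplitude(v, eps) / (1.0 - eps ** 2)) ** 2


def first_dark_ring(eps):
    """First zero of the amplitude, by bisection on [2.0, 4.5]."""
    lo, hi = 2.0, 4.5
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if airy_amplitude(mid, eps) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def encircled_energy(v0, eps, intervals=400):
    """EE(v0) = (1 - eps^2)/2 * int_0^v0 i(v) v dv, by Simpson's rule."""
    h = v0 / intervals
    total = 0.0
    for k in range(intervals + 1):
        weight = 1 if k in (0, intervals) else (4 if k % 2 else 2)
        total += weight * psf(k * h, eps) * (k * h)
    return 0.5 * (1.0 - eps ** 2) * total * h / 3.0


if __name__ == "__main__":
    for eps in (0.0, 0.33):
        v1 = first_dark_ring(eps)
        print(f"eps = {eps:.2f}: v1 = {v1:.3f}, EE_1 = {encircled_energy(v1, eps):.3f}")
```

For $\varepsilon = 0$ the result should agree with Eq. (10.2.16); for $\varepsilon = 0.33$ it can be compared with the corresponding entry of Table 10.2.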

Returning to Eq. (10.2.17), we note that the value of the integral in this relation 
is equal to $\varepsilon + \delta$ for $v_0 \gg 1$, with $\delta \ll \varepsilon$. As an example, with $\varepsilon = 0.33$ we find 
$\delta < 0.01$ for $v_0 > 15$. Therefore a good approximation to EE for large $v_0$ is found 
by substituting $\varepsilon$ for the integral in Eq. (10.2.17), and Eq. (10.2.11) for $J_0$ and $J_1$. 

Combining the terms involving the Bessel functions, we find 


$$J_0^2(\beta) + J_1^2(\beta) = \frac{2}{\pi\beta},$$


Fig. 10.8. Encircled energy fraction for obscuration ratios $\varepsilon = 0$ (solid line) and $\varepsilon = 0.33$ (dashed 
line), for perfect image. 


Table 10.2 
Encircled Energy Fraction within Airy Dark Rings^a 

$\varepsilon$      EE$_1$    EE$_2$    EE$_3$ 
0.00   0.838   0.910   0.938 
0.10   0.818   0.906   0.925 
0.20   0.764   0.900   0.908 
0.33   0.654   0.898   0.904 
0.40   0.584   0.885   0.903 
0.50   0.479   0.829   0.901 
0.60   0.372   0.717   0.873 

^a Subscript on EE is number of dark ring starting at innermost ring. 


Fig. 10.9. Encircled energy fraction within first and second dark rings and within one-half of peak 
intensity. Results are given as a function of obscuration ratio. 


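where $\beta$ is either $v_0$ or $\varepsilon v_0$. Therefore 


$$EE(v_0 \gg 1) = 1 - \frac{2}{\pi(1-\varepsilon)v_0} = 1 - \frac{2\lambda}{\pi^2(1-\varepsilon)D\alpha}, \tag{10.2.18}$$


where the second term, OE $= 1 -$ EE, is the fraction of the energy outside radius $v_0$, 
and $\alpha = \lambda v_0/\pi D$ is the corresponding angular radius. Examination of Eq. 
(10.2.18) shows that the larger $\varepsilon$, the larger is the fraction of the energy outside a 
given large radius. 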


10.2.d. IRRADIANCE AND INTENSITY 


The PSF defined in Section 10.2.a is a dimensionless measure of the intensity 
or irradiance of the Airy pattern, but it is also necessary to give physical units to 
the PSF. In this section we give relations for the irradiance and intensity at the 
center of the Airy pattern and discuss the average irradiance over the Airy disk. 

The terms intensity and irradiance are often interchanged in usage. The 
definition of irradiance is the energy incident on a surface per unit area per 
unit time, with units watts/m² (W/m²). Physicists commonly use the word 
intensity for the flow of energy per unit area per unit time through a surface. 
Astronomers, however, generally follow the definition that intensity is the energy 
per unit time in a certain direction per unit solid angle, with units watts/steradian 
(W/sr). We follow this latter usage of intensity. The symbol $I$ is often used to 
represent both intensity and irradiance, and we follow this convention. The 
interested reader should consult the reference by Mahajan (1998) for a thorough 
discussion of the radiometry of imaging. 

The relation given in Eq. (10.2.15) is derived by Born and Wolf (1980). 
Replacing the energy $E$ in Eq. (10.2.15) by the energy per unit time, or flux $\mathcal{F}$, 
we have the irradiance $I_0$ at the peak of the PSF as 


$$I_0 = \frac{\mathcal{F}\pi D^2(1-\varepsilon^2)}{4\lambda^2 f^2} = \frac{\pi\mathcal{F}(1-\varepsilon^2)}{4\lambda^2F^2}. \tag{10.2.19}$$


It is instructive to compute the peak irradiance for a specific case. Consider a 
perfect Hubble Space Telescope (HST) with $D = 2.4$ m, $F = 24$, $\varepsilon = 0.33$, and 
area $A = 4.03$ m². Taking the canonical value for the photon flux as 1E4 
photons/(sec cm² nm) for a zero-magnitude star at $\lambda = 550$ nm, we get 
$\mathcal{F} = 4.03$E8 photons/(sec nm) = 1.46E-10 W/nm, for the photon and energy 
flux per nm in the image of the HST with unit transmittance. Substituting the 
given values into Eq. (10.2.19), we find $I_0 = 1.63$E18 photons/(sec m² nm) = 
0.59 W/(m² nm) = 5.9E-13 W/(μm² nm), for a zero-magnitude star. 
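
The arithmetic of this HST example can be retraced with a few lines of Python. This is an added sketch: the function names are hypothetical, and the rounded values of Planck's constant and the speed of light are assumptions. It evaluates Eq. (10.2.19) in the form $I_0 = \mathcal{F}A/\lambda^2 f^2$ and should reproduce the quoted figures to within rounding (a peak irradiance of roughly 1.6E18 photons/(sec m² nm)).

```python
import math

H_PLANCK = 6.626e-34  # Planck constant, J s (rounded)
C_LIGHT = 2.998e8     # speed of light, m/s (rounded)


def annular_area(diameter, eps):
    """Area of an annular aperture, A = pi a^2 (1 - eps^2), in m^2."""
    return math.pi * (diameter / 2.0) ** 2 * (1.0 - eps ** 2)


def peak_irradiance(flux, diameter, focal_ratio, eps, wavelength):
    """Eq. (10.2.19) written as I0 = flux * A / (lambda * f)^2."""
    focal_length = focal_ratio * diameter
    return flux * annular_area(diameter, eps) / (wavelength * focal_length) ** 2


# Values quoted in the text for a perfect HST and a zero-magnitude star.
D, F_RATIO, EPS, LAM = 2.4, 24.0, 0.33, 550e-9
ZERO_MAG_FLUX = 1.0e4  # photons/(sec cm^2 nm)

area = annular_area(D, EPS)                       # m^2
photon_flux = ZERO_MAG_FLUX * 1.0e4 * area        # photons/(sec nm); 1 m^2 = 1E4 cm^2
photon_energy = H_PLANCK * C_LIGHT / LAM          # J per photon
energy_flux = photon_flux * photon_energy         # W/nm
i0_photons = peak_irradiance(photon_flux, D, F_RATIO, EPS, LAM)
i0_watts = i0_photons * photon_energy             # W/(m^2 nm)

if __name__ == "__main__":
    print(f"A = {area:.2f} m^2, flux = {photon_flux:.3g} photons/(sec nm)")
    print(f"I0 = {i0_photons:.3g} photons/(sec m^2 nm) = {i0_watts:.2f} W/(m^2 nm)")
```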

Another quantity of interest is the average irradiance over the Airy disk, that 
part of the image enclosed by the first dark ring. The average irradiance $\langle I(\text{disk})\rangle$ 
is the flux in the disk divided by its area, and is given by 


$$\langle I(\text{disk})\rangle = \frac{\sigma\mathcal{F}}{\pi r_1^2} = \frac{\sigma\mathcal{F}}{\pi\gamma^2(1.22\lambda F)^2}, \tag{10.2.20}$$


where $\mathcal{F}$ is the total flux as in Eq. (10.2.19), $\sigma$ is the fraction of the total flux in the 
Airy disk, $r_1$ is the radius of the Airy disk, and $\gamma$ is a numerical factor such that 
$1.22\gamma = w_1$ from Table 10.1 for an annular aperture. The value of $\sigma$ also depends 
on $\varepsilon$, as noted in the discussion of encircled energy in the previous section. 

Dividing Eq. (10.2.20) by Eq. (10.2.19), we get 


$$\frac{\langle I(\text{disk})\rangle}{I_0} = \left(\frac{2}{1.22\pi}\right)^2\frac{\sigma}{\gamma^2(1-\varepsilon^2)}. \tag{10.2.21}$$


Taking $\varepsilon = 0$ and $\varepsilon = 0.33$, we use the results in Table 10.1 and find $\gamma = 1$ and 
$\gamma = 0.9$, respectively, and from Table 10.2 we get $\sigma = 0.838$ and $\sigma = 0.654$, 
respectively. Putting these values into Eq. (10.2.21) gives $\langle I(\text{disk})\rangle/I_0 = 0.228$ 
and 0.246 for the apertures with $\varepsilon = 0$ and $\varepsilon = 0.33$, respectively, hence a smaller 
Airy disk roughly compensates for the smaller encircled energy fraction. 
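
The two ratios quoted above follow directly from Eq. (10.2.21). The short Python sketch below is an added illustration (the function name is hypothetical) that evaluates the ratio for the $(\sigma, \gamma, \varepsilon)$ values given in the text.

```python
import math


def disk_to_peak_ratio(sigma, gamma, eps):
    """Eq. (10.2.21): <I(disk)>/I0 = (2/(1.22 pi))^2 * sigma / (gamma^2 (1 - eps^2))."""
    return (2.0 / (1.22 * math.pi)) ** 2 * sigma / (gamma ** 2 * (1.0 - eps ** 2))


if __name__ == "__main__":
    # (sigma, gamma, eps) for the clear aperture and for eps = 0.33, as quoted in the text.
    for sigma, gamma, eps in ((0.838, 1.0, 0.0), (0.654, 0.9, 0.33)):
        print(f"eps = {eps:.2f}: ratio = {disk_to_peak_ratio(sigma, gamma, eps):.3f}")
```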

For another example assume a passband of 100 nm centered at $\lambda = 550$ nm, 
and a star of apparent magnitude 25 imaged by HST. In this case we find a photon 
flux of 4.03 photons/sec passing through the HST aperture and 2.63 photons/sec 
on the Airy disk. Assuming a reflectance of 0.9 for the primary and secondary 
HST mirrors, a more accurate flux value is approximately 2.1 photons/sec for a 
star of apparent magnitude 25 at the f/24 focus of HST. The detected photon 
flux, of course, depends on the efficiency of the optics and detector in a reimaging 
camera. 
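
The photon budget for the faint-star example can be sketched in Python as follows. This is an added illustration: the function name and its keyword defaults are hypothetical, the zero-magnitude flux and the 0.654 disk fraction are taken from the text, and the mirror losses are modeled simply as two reflections at 0.9 each.

```python
def photon_rate(magnitude, area_m2, passband_nm, zero_mag_flux=1.0e4):
    """Photons/sec through the aperture for a star of the given apparent magnitude.

    zero_mag_flux is in photons/(sec cm^2 nm); the factor 1E4 converts m^2 to cm^2.
    """
    return zero_mag_flux * 1.0e4 * area_m2 * passband_nm * 10.0 ** (-0.4 * magnitude)


through_aperture = photon_rate(25.0, 4.03, 100.0)   # photons/sec entering HST
on_airy_disk = 0.654 * through_aperture             # fraction inside the first dark ring
after_mirrors = 0.9 * 0.9 * on_airy_disk            # two reflections at reflectance 0.9

if __name__ == "__main__":
    print(f"{through_aperture:.2f} {on_airy_disk:.2f} {after_mirrors:.2f} photons/sec")
```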

As a final item we note that Born and Wolf also define $I_0 = EA/\lambda^2$, hence units 
are those of intensity rather than irradiance. In this case the integral in Eq. 
(10.2.13) is over the solid angle subtended by the image at the aperture instead of 
the area of the image. 


10.2.e. RESOLUTION LIMIT 


A telescope is often used at or near its angular limit of resolution, the 
minimum angular separation between two point sources of approximately equal 
brightness which can be seen as two separate images, or just resolved. Following 
the criterion first put forth by Lord Rayleigh, we say two stars of equal brightness 
are just resolved when the peak of one Airy disk falls on the first dark ring of the 
other Airy disk. Therefore the angular limit of resolution is 


$$(\Delta\theta)_{\min} = 1.22\gamma\lambda/D, \tag{10.2.22}$$


where $1.22\gamma = w_1$ from Table 10.1 accounts for the decreasing diameter of the 
Airy disk with increasing obscuration. 

At the point midway between the PSF peaks, the normalized intensity of the 
sum is $2i(w_1/2)$ and ranges from about 0.74 at $\varepsilon = 0$ to 0.81 at $\varepsilon = 0.33$. A 
detector with several pixels spanning an Airy disk will easily resolve the separate 
images in this case, and the separation at which two images are “just resolved” is 
somewhat smaller than that given in Eq. (10.2.22). The actual limit of resolution in 
practice depends on the brightness ratio of the stars and the characteristics of the 
detector. The convention adopted for convenience, however, does not consider these 
details and Eq. (10.2.22) gives the accepted limit. 
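
The midpoint intensity for two stars at the Rayleigh separation can be estimated numerically. The Python sketch below is an added illustration with assumptions: the annular PSF is taken in the standard form $i(v) = [2J_1(v)/v - \varepsilon^2\,2J_1(\varepsilon v)/\varepsilon v]^2/(1-\varepsilon^2)^2$, the helper names are hypothetical, and the bisection bracket is chosen for $0 \le \varepsilon \le 0.6$. The dip between the peaks is $2i(v_1/2)$, where $v_1$ is the first dark ring.

```python
import math


def bessel_j1(x, steps=200):
    """J_1(x) from Bessel's integral, midpoint rule (helper for this sketch)."""
    h = math.pi / steps
    return sum(math.cos((i + 0.5) * h - x * math.sin((i + 0.5) * h))
               for i in range(steps)) / steps


def airy_amplitude(v, eps):
    """Unnormalized amplitude 2*J1(v)/v - eps^2 * 2*J1(eps*v)/(eps*v)."""
    if v == 0.0:
        return 1.0 - eps ** 2
    return 2.0 * (bessel_j1(v) - eps * bessel_j1(eps * v)) / v


def psf(v, eps):
    """Normalized PSF i(v) of an annular aperture."""
    return (airy_amplitude(v, eps) / (1.0 - eps ** 2)) ** 2


def first_dark_ring(eps):
    """First zero of the amplitude, by bisection on [2.0, 4.5]."""
    lo, hi = 2.0, 4.5
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if airy_amplitude(mid, eps) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def midpoint_intensity(eps):
    """Summed intensity midway between two equal stars at the Rayleigh separation."""
    return 2.0 * psf(0.5 * first_dark_ring(eps), eps)


if __name__ == "__main__":
    for eps in (0.0, 0.33):
        print(f"eps = {eps:.2f}: midpoint intensity = {midpoint_intensity(eps):.2f}")
```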


10.3. THE NEAR PERFECT IMAGE 


An image is perfect if the wavefront emerging from the exit pupil is spherical; 
if there are any deviations of the wavefront from a sphere the result is a less-than- 
perfect image. These wavefront deviations may be due to the presence of 
geometric aberrations of the type discussed in Chapter 5, but also arise from 
random variations in optical surface quality as a result of the polishing process. 
Each of these wavefront deviations is characterized by a different scale at the exit 
pupil. Geometric aberrations vary slowly across the aperture and are specified in 
functional form, while random variations occur on a much shorter scale and are 
usually treated with statistical models. 

Wavefront errors may also arise if the shape or orientation of the wavefront 
changes with time, where such time-dependent errors may be regular or random, 
and on a slow or fast time-scale. An example of a slow, regular time-dependent 
error is the change in focus of a Cassegrain telescope due to temperature changes. 
The error could be eliminated by periodically adjusting the secondary mirror and 
refocusing the telescope. Correction of slow time-dependent errors comes under 
the heading of active optics. An example of a rapid, random time-dependent error 
is the oscillation of an image centroid about its mean position due to atmospheric 
effects. Correction of rapid fluctuations in the shape of a wavefront is done with 
adaptive optics. These latter types of errors are also best treated with a statistical 
approach, with an introduction to this approach given in the next chapter. 

In this section we consider geometric aberrations and their effects on image 
quality. Our discussion is only an introduction to a large subject matter, and the 
interested reader should consult some of the references listed at the end of the 
chapter for more extensive discussions. 


10.3.a. DIFFRACTION INTEGRAL WITH ABERRATIONS 


A cross section of a wavefront with aberrations and the reference sphere are 
shown in Fig. 5.3, where $\Delta$, as given in Eq. (5.3.1), is the geometrical path 
difference between the wavefront and reference sphere. In the notation of Fig. 
10.1 the center of curvature O of the reference sphere is the location of the 
Gaussian image for a perfect system. The coordinate systems used to locate 
points on the wavefront and near the image are given in Eq. (10.2.1) for a circular 
aperture. 

To include aberrations in the diffraction integral given in Eq. (10.1.1), we 
substitute $(s - R + \Phi)$ for $(s - R)$, where $\Phi$ is the optical path difference between 
the aberrated wavefront and reference sphere. If we consider only third-order 
aberrations, then from Eq. (5.5.1) we get 


$$\Phi = B_0 y + B_1 y^2 + B_1' x^2 + B_2 y(x^2 + y^2) + B_3(x^2 + y^2)^2. \tag{10.3.1}$$


We choose, as in Chapter 5, to describe the astigmatism at the sagittal image, 
hence $B_1' = 0$. To make the notation in $\Phi$ consistent with that used in this chapter 
for a circular aperture, we replace $x$ and $y$ in Eq. (10.3.1) by $\xi$ and $\eta$, respectively, 
using Eq. (10.2.1). The result is 


$$\Phi = B_0 a\rho\sin\varphi + B_1 a^2\rho^2\sin^2\varphi + B_2 a^3\rho^3\sin\varphi + B_3 a^4\rho^4$$

$$\phantom{\Phi} = \lambda\left(a_{11}\rho\sin\varphi + a_{22}\rho^2\sin^2\varphi + a_{31}\rho^3\sin\varphi + a_{40}\rho^4\right), \tag{10.3.2}$$


where the $a$ coefficients include the radius of the exit pupil and are dimensionless. 
Note that the dimensions of $\Phi$ are included in the wavelength $\lambda$ in Eq. (10.3.2). 
The factors in Eq. (10.3.2) represent, in turn, distortion, astigmatism, coma, and 
spherical aberration, as follows: 


$$\lambda a_{11} = B_0 a, \quad \lambda a_{22} = B_1 a^2, \quad \lambda a_{31} = B_2 a^3, \quad \lambda a_{40} = B_3 a^4, \tag{10.3.3}$$


with each corresponding $a$ coefficient giving the amount of aberration in units of 
waves. 

Note that the subscripts on the $a$ coefficients are changed from those in the 
previous line with the first subscript the power of $\rho$ and the second the power of 
$\sin\varphi$. This is done to bring our notation in line with that commonly used as, for 
example, by Mahajan (1998). It is also important to note that Eq. (10.3.2) is the 
optical path difference in a simplified case, that in which the incident chief ray is 
in the $yz$ plane, as shown in Fig. 5.1. If we had chosen the $xz$ plane instead, Eq. 
(10.3.2) would have $\cos\varphi$ rather than $\sin\varphi$. 

The diffraction integral for a circular exit pupil, including aberrations, is found 
by substituting the sum of Eq. (10.2.4) and $k$ times Eq. (10.3.2) for $k(s - R)$ in 
Eq. (10.1.1). The result is 


$$U(P) = Ca^2\int_0^{2\pi}\!\!\int_\varepsilon^1 \exp\left\{i\left[k\Phi - v\rho\cos(\varphi - \psi) - u\rho^2/2\right]\right\}\rho\,d\rho\,d\varphi. \tag{10.3.4}$$


Note that the term $u(R/a)^2$ in Eq. (10.2.4) is not included in Eq. (10.3.4). This 
term does not depend on the variables of integration, is removed from the 
integral, and does not appear in $|U(P)|^2$. 

A complete analysis of Eq. (10.3.4) is beyond the scope of our treatment. For 
such an analysis the interested reader should consult the references by Born and 
Wolf (1980), Mahajan (1991, 1998), and Wetherell (1980) given at the end of this 
chapter. We do present selected results after discussing the effect of aberrations 
on peak intensity. 


10.3.b. PEAK INTENSITY AND AVERAGE WAVEFRONT ERROR 


Before discussing specific aberrations, it is important to show the relation 
between the peak intensity and the average wavefront error. We take point P at the 
center of the reference sphere, hence u = v = 0, and assume the aberrations are 
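small. Given that $i(P) = |U(P)|^2$ is normalized to unity for a perfect image, we 
find 


$$i(0) = \frac{1}{\pi^2(1-\varepsilon^2)^2}\left|\int_0^{2\pi}\!\!\int_\varepsilon^1 \exp(ik\Phi)\,\rho\,d\rho\,d\varphi\right|^2$$

$$\phantom{i(0)} = C_0^2\left|\int_0^{2\pi}\!\!\int_\varepsilon^1\left[1 + ik\Phi + \frac{(ik\Phi)^2}{2} + \cdots\right]\rho\,d\rho\,d\varphi\right|^2, \tag{10.3.5}$$


where $C_0 = 1/\pi(1-\varepsilon^2)$. We now define $\langle\Phi^n\rangle$ as the average of the $n$th power of 
$\Phi$, where 


$$\langle\Phi^n\rangle = C_0\int_0^{2\pi}\!\!\int_\varepsilon^1 \Phi^n\,\rho\,d\rho\,d\varphi. \tag{10.3.6}$$


Neglecting all factors in $k\Phi$ higher than second power in Eq. (10.3.5), we can 
write the approximate intensity at the center of the reference sphere as 


$$i'(0) = \left|1 + ik\langle\Phi\rangle - k^2\langle\Phi^2\rangle/2\right|^2 \approx 1 - k^2\left[\langle\Phi^2\rangle - \langle\Phi\rangle^2\right] = 1 - k^2\omega^2, \tag{10.3.7}$$


where $i'$ is used to indicate that this is an approximation to $i$. The parameter $\omega$ is 
the root-mean-square (rms) wavefront error given by 


$$\omega = \left[\langle\Phi^2\rangle - \langle\Phi\rangle^2\right]^{1/2}. \tag{10.3.8}$$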


The rms wavefront error is a useful parameter for characterizing a high-quality 
optical system because its value can be calculated once the type and magnitude of 
aberrations are known. We see from Eq. (10.3.7) that the normalized intensity at 
the location of the nominal focal point is independent of the type of aberration, 
with the decrease from unity proportional to $\omega^2$ in this approximation. 

In the presence of aberrations, the normalized intensity $i'(0)$ is often used as 
one measure of image quality. This normalized intensity, by convention, is called 
the Strehl intensity or Strehl ratio. A common convention is to consider a system 
as diffraction-limited if the Strehl ratio is greater than or equal to 0.8. Given this 
convention we find that $\omega = 0.0712\lambda \approx \lambda/14$ for a system that is just 
diffraction-limited. 

The Strehl ratio given by Eq. (10.3.7) is an approximation to the normalized 
peak intensity valid for small $\omega$. It was shown by Mahajan (1983) that a better 
approximation for the Strehl ratio $S$ is given by 


$$S = i'(0) = \exp(-k^2\omega^2). \tag{10.3.9}$$


A comparison between $i(0)$ calculated directly from Eq. (10.3.5) with $S$ from Eq. 
(10.3.9) shows that the latter agrees with the former with an error of less than 
10% for $S$ greater than 0.3. This limit corresponds approximately to $\omega = \lambda/5.7$. 
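
The two Strehl approximations can be compared against an exact result in one simple case. The Python sketch below is an added illustration with stated assumptions: for pure defocus $\Phi = \lambda a_{20}\rho^2$ on a clear aperture, evaluating Eq. (10.3.5) in closed form gives $i(0) = [\sin(\pi a_{20})/\pi a_{20}]^2$, and the rms error is $\omega = |a_{20}|/2\sqrt{3}$ in waves. The function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def strehl_quadratic(omega):
    """Eq. (10.3.7): S ~ 1 - k^2 w^2, with omega in waves so that k*w = 2*pi*omega."""
    return 1.0 - (TWO_PI * omega) ** 2


def strehl_exponential(omega):
    """Eq. (10.3.9): S ~ exp(-k^2 w^2), omega in waves."""
    return math.exp(-(TWO_PI * omega) ** 2)


def strehl_defocus_exact(a20):
    """Exact on-axis intensity for pure defocus, clear aperture (assumed closed form)."""
    x = math.pi * a20
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2


def rms_defocus(a20):
    """RMS wavefront error of a20*rho^2 over a clear aperture, in waves."""
    return abs(a20) / (2.0 * math.sqrt(3.0))


# rms errors (in waves) at which each approximation reaches the quoted Strehl values
omega_diffraction_limited = math.sqrt(1.0 - 0.8) / TWO_PI   # from 1 - k^2 w^2 = 0.8
omega_strehl_03 = math.sqrt(-math.log(0.3)) / TWO_PI        # from exp(-k^2 w^2) = 0.3

if __name__ == "__main__":
    print(f"S = 0.8 at omega = lambda/{1.0 / omega_diffraction_limited:.1f}")
    print(f"S = 0.3 at omega = lambda/{1.0 / omega_strehl_03:.1f}")
    for a20 in (0.25, 0.5):
        w = rms_defocus(a20)
        print(a20, strehl_defocus_exact(a20), strehl_exponential(w), strehl_quadratic(w))
```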

Table 10.3 

Classical Aberrations and RMS Wavefront Errors^{a,b} 

Aberration                                  RMS Wavefront Error 
Spherical     $a_{40}\rho^4$                $\dfrac{|a_{40}|}{3\sqrt{5}}\,(4 - \varepsilon^2 - 6\varepsilon^4 - \varepsilon^6 + 4\varepsilon^8)^{1/2}$ 
Coma          $a_{31}\rho^3\sin\varphi$     $\dfrac{|a_{31}|}{\sqrt{8}}\,(1 + \varepsilon^2 + \varepsilon^4 + \varepsilon^6)^{1/2}$ 
Astigmatism   $a_{22}\rho^2\sin^2\varphi$   $\dfrac{|a_{22}|}{4}\,(1 + \varepsilon^4)^{1/2}$ 
Defocus       $a_{20}\rho^2$                $\dfrac{|a_{20}|}{2\sqrt{3}}\,(1 - \varepsilon^2)$ 
Distortion    $a_{11}\rho\sin\varphi$       $\dfrac{|a_{11}|}{2}\,(1 + \varepsilon^2)^{1/2}$ 

^a RMS error is given in units of wavelength. For linear measure, 
multiply by the wavelength. 

^b Error for astigmatism is given at sagittal focus; other errors given at 
paraxial focus. 


10.3.c. CLASSICAL ABERRATIONS AND WAVEFRONT ERROR 


The classical third-order aberrations are those given in Eq. (10.3.2). Substitut- 
ing each in turn into Eq. (10.3.6), it is a straightforward matter to calculate the rms 
wavefront error for each, with the results given in Table 10.3. These expressions 
for $\omega$ are appropriate for the specific image locations used to derive the aberration 
coefficients in Chapter 5: at the nominal paraxial or Gaussian focus for spherical 
aberration, coma, and distortion, and at the sagittal image for astigmatism. Note 
an additional aberration in Table 10.3, that of pure defocus. 
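
The substitution into Eq. (10.3.6) can also be carried out numerically, which gives an independent check on the closed-form entries of Table 10.3. The Python sketch below is an added illustration: the function names, grid sizes, and dictionary layout are hypothetical choices. It averages $\Phi$ and $\Phi^2$ over the annulus with a midpoint rule, with each coefficient set to one wave, and forms $\omega$ from Eq. (10.3.8).

```python
import math


def rms_wavefront_error(phi, eps, n_rho=200, n_phi=64):
    """omega from Eqs. (10.3.6) and (10.3.8): midpoint rule over eps <= rho <= 1."""
    d_rho = (1.0 - eps) / n_rho
    d_ang = 2.0 * math.pi / n_phi
    c0 = 1.0 / (math.pi * (1.0 - eps ** 2))
    mean = mean_sq = 0.0
    for i in range(n_rho):
        rho = eps + (i + 0.5) * d_rho
        for j in range(n_phi):
            value = phi(rho, (j + 0.5) * d_ang)
            weight = c0 * rho * d_rho * d_ang
            mean += value * weight
            mean_sq += value * value * weight
    return math.sqrt(mean_sq - mean ** 2)


# Classical aberrations of Eq. (10.3.2) with unit coefficients (one wave each).
CLASSICAL = {
    "spherical": lambda r, a: r ** 4,
    "coma": lambda r, a: r ** 3 * math.sin(a),
    "astigmatism": lambda r, a: r ** 2 * math.sin(a) ** 2,
    "defocus": lambda r, a: r ** 2,
    "distortion": lambda r, a: r * math.sin(a),
}


def table_10_3(name, eps):
    """Closed-form rms errors per unit coefficient, as listed in Table 10.3."""
    e2 = eps ** 2
    closed = {
        "spherical": math.sqrt(4 - e2 - 6 * e2 ** 2 - e2 ** 3 + 4 * e2 ** 4) / (3 * math.sqrt(5)),
        "coma": math.sqrt(1 + e2 + e2 ** 2 + e2 ** 3) / math.sqrt(8),
        "astigmatism": math.sqrt(1 + e2 ** 2) / 4,
        "defocus": (1 - e2) / (2 * math.sqrt(3)),
        "distortion": math.sqrt(1 + e2) / 2,
    }
    return closed[name]


if __name__ == "__main__":
    for name, phi in CLASSICAL.items():
        print(f"{name:12s} {rms_wavefront_error(phi, 0.33):.4f} {table_10_3(name, 0.33):.4f}")
```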

Numerical calculations of $i(P)$ using Eq. (10.3.4) including focus shift show 
that the Strehl ratio is not a maximum at these image locations. In the presence of 
spherical aberration only, for example, the “best” image is not at the paraxial 
focus but between the paraxial and marginal foci, as a glance at Fig. 4.5 shows. 
The distance between these two foci is $2F\,$TSA, where $F = s'/2a$ and TSA is 
given in Eqs. (5.5.9). After substituting for $B_3$ from Eqs. (10.3.3) we get the 
separation between the paraxial and marginal foci as $16\lambda a_{40}F^2$. 

Calculations with $\varepsilon = 0$ show that $i(0)$ is largest at a point half way between 
these two foci and the corresponding rms wavefront error is a minimum at this 
point. The dependence of $\omega$ and peak intensity on focus shift for $\varepsilon = 0$ is shown 
in Fig. 10.10; in this case the value of $\omega$ at the paraxial focus is 4 times larger than 
at the point where the peak intensity is a maximum. 

Fig. 10.10. Normalized rms wavefront error $\omega$ and Strehl ratio $S$ for image with spherical 
aberration as a function of image surface location. $S$ is given for normalized $\omega = 1$ corresponding 
to $\omega = 0.075\lambda$ at $z' = 1$. The normalized focus shift is $z'$. P, paraxial focus; M, marginal focus; 
C, circle of least confusion. 


Following a similar procedure, it turns out that a system with astigmatism has 
a minimum $\omega$ and maximum $i(0)$ at a point half way between the sagittal and 
tangential line images for any value of $\varepsilon$. The distance between the line images is 
$2F\,$TAS, where TAS is given in Eqs. (5.5.9). After substituting $B_1$ from Eqs. 
(10.3.3) we find the separation between the line images as $8\lambda a_{22}F^2$. The 
dependence of $\omega$ and peak intensity on focus shift is shown in Fig. 10.11 for 
an astigmatic image. For a clear aperture the value of $\omega$ at either line image is 
about 20% larger than at the midway point. 

For both spherical aberration and astigmatism, the point of maximum i(0) is on 
the axis defined by v = 0. For coma and distortion i(0) is a maximum for a point 
displaced transversely from the paraxial image point and v is not zero. For a 
single aberration, the point at which the peak intensity is a maximum is called the 
diffraction focus. Table 10.4 gives the shifts from the foci specified in Table 10.3 
to the diffraction focus for each of the classical aberrations. In the following 
section we outline the procedure by which these shifts are calculated. 

Fig. 10.11. Normalized rms wavefront error $\omega$ and Strehl ratio $S$ for astigmatic image as a 
function of image surface location, with $z' = 0, 1$ at line images. $S$ is given for normalized $\omega = 1$ 
corresponding to $\omega = 0.075\lambda$ at blur circle B. The normalized focus shift is $z'$. L, line image; 
B, blur circle. 


Table 10.4 

Coordinate Shifts to Diffraction Focus^a 

Aberration           Along y-axis                                                                                     Along z-axis 
Spherical            0                                                                                                $8a_{40}\lambda(1 + \varepsilon^2)F^2$ 
Coma                 $\dfrac{4a_{31}\lambda F}{3}\,\dfrac{(1 + \varepsilon^2 + \varepsilon^4)}{(1 + \varepsilon^2)}$   0 
Astigmatism          0                                                                                                $4a_{22}\lambda F^2$ 
Defocus              0                                                                                                $8a_{20}\lambda F^2$ 
Distortion (tilt)    $2a_{11}\lambda F$                                                                               0 

^a For starting point of shift for each aberration, see Table 10.3. 


10.3.d. ORTHOGONAL ABERRATIONS AND ZERNIKE POLYNOMIALS 


As seen from Table 10.4, the location of the diffraction focus depends on the 
type and magnitude of aberration present. Because of this dependence, it is 
appropriate to restructure the classical aberration terms and include explicitly the 
required image shift to place the diffraction focus at u = v = 0. These modified 
terms are called orthogonal aberrations, with the polynomials in $\rho$ and $\varphi$ called 
Zernike polynomials. A list of Zernike polynomials needed for third-order 
aberrations of an unobstructed circular aperture is given in Table 10.5. Note 
the presence of both sin and cos factors in Table 10.5, hence the representation of 
coma and astigmatism of arbitrary orientation in an xy coordinate frame is 
possible. Table 10.6 lists the third-order orthogonal aberration terms for an 
annular aperture, along with expressions for the rms wavefront errors at the 
diffraction focus. For a detailed discussion of the properties of the orthogonal 
aberrations, including derivations, consult the references by both Mahajan and 
Born and Wolf. 

The importance of representing the total aberration of a system as the sum of 
orthogonal aberrations is that each term in the sum is optimally chosen to give a 
minimum rms error over the exit pupil. In addition, the mean square error (MSE) 
of the total aberration is the sum of the MSE of the individual orthogonal 
aberrations. Thus it is straightforward to find the overall rms wavefront error once 
the separate $a_{nm}$ in Eq. (10.3.3) are known. 

Choosing an orthogonal aberration in an optimal way is done by adding one or 
more classical aberrations. We note, for example, that the spherical aberration 
terms in Tables 10.5 and 10.6 show a term in $\rho^2$, a focus shift term, added to that 
of $\rho^4$. A focus shift term is also evident in the entry for astigmatism in Table 10.6. 
In the case of coma, the added term is proportional to $\rho\sin\varphi$, which is effectively 
a tilt. We also see constant terms in the entries for spherical aberration and focus 


Table 10.5 

Zernike Polynomials for Circular Aperture 

Term    $Z(\rho, \varphi)$                 Descriptor 
0       1                                  constant 
1       $\rho\cos\varphi$                  x-tilt 
2       $\rho\sin\varphi$                  y-tilt 
3       $2\rho^2 - 1$                      focus shift 
4       $\rho^2\cos 2\varphi$              x- or y-astigmatism 
5       $\rho^2\sin 2\varphi$              45°-astigmatism 
6       $(3\rho^2 - 2)\rho\cos\varphi$     x-coma 
7       $(3\rho^2 - 2)\rho\sin\varphi$     y-coma 
8       $6\rho^4 - 6\rho^2 + 1$            spherical 
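
Table 10.6 

Orthogonal Aberrations and RMS Wavefront Errors^a 

Orthogonal Aberration                                                                                                                   RMS Wavefront Error 
Spherical:     $a_{40}\left[\rho^4 - (1 + \varepsilon^2)\rho^2 + \frac{1}{6}(1 + 4\varepsilon^2 + \varepsilon^4)\right]$                  $\dfrac{|a_{40}|}{6\sqrt{5}}\,(1 - \varepsilon^2)^2$ 
Coma:          $a_{31}\left[\rho^3 - \dfrac{2(1 + \varepsilon^2 + \varepsilon^4)}{3(1 + \varepsilon^2)}\,\rho\right]\sin\varphi$        $\dfrac{|a_{31}|(1 - \varepsilon^2)(1 + 4\varepsilon^2 + \varepsilon^4)^{1/2}}{6\sqrt{2}\,(1 + \varepsilon^2)^{1/2}}$ 
Astigmatism:   $a_{22}\rho^2\left(\sin^2\varphi - \frac{1}{2}\right)$                                                                    $\dfrac{|a_{22}|}{2\sqrt{6}}\,(1 + \varepsilon^2 + \varepsilon^4)^{1/2}$ 
Defocus:       $a_{20}\left[\rho^2 - \frac{1}{2}(1 + \varepsilon^2)\right]$                                                              $\dfrac{|a_{20}|}{2\sqrt{3}}\,(1 - \varepsilon^2)$ 
Distortion:    $a_{11}\rho\sin\varphi$                                                                                                   $\dfrac{|a_{11}|}{2}\,(1 + \varepsilon^2)^{1/2}$ 

^a Each expression is given in units of wavelength. For linear measure, 
multiply by wavelength. 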


in Tables 10.5 and 10.6. These constant terms are chosen to make the average 
wavefront error $\langle\Phi\rangle = 0$ for these aberrations, without changing the rms error. 
The proof that adding a constant term to $\Phi$ does not change the rms error is left as 
an exercise for the reader. 

We now outline the procedure by which an orthogonal aberration is 
constructed, taking as an example spherical aberration for a clear aperture. The 
starting point is to write the wavefront error as classical spherical plus a variable 
focus shift, 


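$$\Phi = \lambda a_{40}(\rho^4 - \alpha\rho^2). \tag{10.3.10}$$

Substituting Eq. (10.3.10) into Eq. (10.3.6) with $\varepsilon = 0$ gives 


$$\langle\Phi\rangle = \lambda a_{40}\left(\frac{1}{3} - \frac{\alpha}{2}\right), \qquad \langle\Phi^2\rangle = (\lambda a_{40})^2\left(\frac{1}{5} - \frac{\alpha}{2} + \frac{\alpha^2}{3}\right),$$

$$\omega^2 = (\lambda a_{40})^2\left(\frac{4}{45} - \frac{\alpha}{6} + \frac{\alpha^2}{12}\right). \tag{10.3.11}$$


Setting the derivative of $\omega^2$ with respect to $\alpha$ equal to zero and solving for $\alpha$ gives 
$\alpha = 1$, $\Phi = \lambda a_{40}(\rho^4 - \rho^2)$, and $\omega^2(\min) = (\lambda a_{40})^2/180$. The relation between $\alpha$ 
and the linear focus shift is found by noting that $k$ times the term in $\alpha$ in Eq. 
(10.3.10) equals $-u\rho^2/2$ in Eq. (10.3.4), hence $\alpha = \Delta z/(8\lambda a_{40}F^2)$ where $\Delta z$ is 
the shift from paraxial focus. The constant term added to $\Phi$ is $-\langle\Phi\rangle$ with $\alpha = 1$. 
This general procedure can be used to verify the entries in Tables 10.4 and 10.6, 
starting with the classical aberrations in Table 10.3. 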
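
The balancing step for spherical aberration can be verified with exact rational arithmetic. The Python sketch below is an added illustration with hypothetical function names. It uses the clear-aperture moments $\langle\rho^p\rangle = 2/(p+2)$ to build the quadratic $\omega^2/(\lambda a_{40})^2 = c_0 + c_1\alpha + c_2\alpha^2$ of Eq. (10.3.11) and then locates its minimum.

```python
from fractions import Fraction


def moment(p):
    """<rho^p> over a clear circular aperture: 2 * int_0^1 rho^(p+1) d(rho) = 2/(p+2)."""
    return Fraction(2, p + 2)


def variance_coefficients():
    """Coefficients (c0, c1, c2) of omega^2/(lambda*a40)^2 = c0 + c1*alpha + c2*alpha^2
    for Phi = lambda*a40*(rho^4 - alpha*rho^2), Eq. (10.3.10)."""
    # <Phi^2> = <rho^8> - 2*alpha*<rho^6> + alpha^2*<rho^4>
    # <Phi>^2 = (<rho^4> - alpha*<rho^2>)^2
    c0 = moment(8) - moment(4) ** 2
    c1 = -2 * moment(6) + 2 * moment(4) * moment(2)
    c2 = moment(4) - moment(2) ** 2
    return c0, c1, c2


def best_focus():
    """Return (alpha, minimum variance) from d(omega^2)/d(alpha) = 0."""
    c0, c1, c2 = variance_coefficients()
    alpha = -c1 / (2 * c2)
    return alpha, c0 + c1 * alpha + c2 * alpha ** 2


if __name__ == "__main__":
    print(variance_coefficients())
    print(best_focus())
```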


10.3.e. EXAMPLES 


As illustrations of the effects of aberrations on the PSF, we take two examples: 
a perfect image subject to defocus and an image with spherical aberration at the 
diffraction focus. The results were obtained by numerical integration of Eq. 
(10.3.4) and apply to an unobstructed aperture. 

Figure 10.12 shows image profiles for the disk and the first two bright rings of 
an image with different amounts of defocus. Note that the ring structure, clearly 
visible for $a_{20} = 0.25$ or $\omega \approx \lambda/14$, is essentially absent when $a_{20} > 0.5$. The 
effect of defocus is clearly one of transferring energy from the disk to the nearby 
rings and filling in the dark rings. Though not shown in Fig. 10.12, the intensity 
$i(0) = 0$ when $a_{20} = 1$. In general, the peak intensity is zero for an image with 
pure defocus when $|a_{20}| = 1/(1 - \varepsilon^2)$. Surface plots of defocused PSFs are 
shown in Figs. 10.13 and 10.14 for $a_{20} = 0.25$ and 0.75, respectively. 
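
The vanishing of the peak intensity at $|a_{20}| = 1/(1-\varepsilon^2)$ can be confirmed by evaluating the on-axis integral of Eq. (10.3.5) numerically. The Python sketch below is an added illustration (the function name and step count are hypothetical) for pure defocus $\Phi = \lambda a_{20}\rho^2$, where the azimuthal integral contributes a factor of $2\pi$.

```python
import cmath
import math


def on_axis_intensity(a20, eps, steps=2000):
    """i(0) from Eq. (10.3.5) for Phi = lambda*a20*rho^2, midpoint rule in rho.

    With k*Phi = 2*pi*a20*rho^2, the normalized amplitude is
    (2/(1 - eps^2)) * int_eps^1 exp(2*pi*i*a20*rho^2) rho d(rho).
    """
    d_rho = (1.0 - eps) / steps
    total = 0j
    for i in range(steps):
        rho = eps + (i + 0.5) * d_rho
        total += cmath.exp(2j * math.pi * a20 * rho * rho) * rho * d_rho
    return abs(2.0 * total / (1.0 - eps ** 2)) ** 2


if __name__ == "__main__":
    for eps in (0.0, 0.33):
        a_zero = 1.0 / (1.0 - eps ** 2)
        print(f"eps = {eps:.2f}: i(0) at a20 = {a_zero:.3f} is {on_axis_intensity(a_zero, eps):.2e}")
```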

Figure 10.15 shows image profiles for an image at the diffraction focus with 
different amounts of spherical aberration. In this case the separate rings remain 
relatively well-defined, but the energy within them grows at the expense of the 





Fig. 10.12. Point spread function of perfect image with defocus, plotted against $v/\pi$. The aperture 
is unobstructed and the shift from diffraction focus = $8a_{20}\lambda F^2$. See Fig. 10.5 for PSF of perfect image. 

Fig. 10.13. Surface plot of perfect image with defocus; $a_{20} = 0.25$. 

Fig. 10.14. Surface plot of perfect image with defocus; $a_{20} = 0.75$. 

Fig. 10.15. Point spread function of image with spherical aberration at diffraction focus, plotted 
against $v/\pi$. The aperture is unobstructed. See Fig. 10.5 for PSF of perfect image. 

Fig. 10.16. Surface plot of spherically aberrant image at diffraction focus; $a_{40} = 1$. 

Fig. 10.17. Surface plot of spherically aberrant image at diffraction focus; $a_{40} = 3$. 

Table 10.7 

EE and $\omega$ for Images with Aberrations 

         Fig. 10.12                         Fig. 10.15 
$a_{20}$    EE$_1$    $\omega$         $a_{40}$    EE$_1$    $\omega$ 
0.25     0.733    0.072          1.0      0.668    0.075 
0.50     0.490    0.144          2.0      0.324    0.149 
0.75     0.248    0.217          3.0      0.094    0.224 
1.00     0.105    0.289          4.0      0.068    0.298 

disk. Surface plots of spherically aberrant PSFs are shown in Figs. 10.16 and 
10.17 for $a_{40} = 1$ and 3, respectively. 

The rms wavefront errors and encircled energy in the Airy disk for the profiles 
in Figs. 10.12 and 10.15 are given in Table 10.7. The results for EE$_1$ were 
obtained by numerical integration of Eq. (10.2.13). It is evident from these entries 
that encircled energy fraction within the Airy disk drops dramatically with 
increasing rms wavefront error. 


10.4. COMPARISON: GEOMETRIC ABERRATIONS AND THE DIFFRACTION LIMIT 


It is important to compare aberrations as computed from geometric optics, as 
done in Chapter 5, with those found using diffraction optics, as done in this 
chapter. We make this comparison in terms of angular aberrations because these 
are especially significant for telescopes. Our discussion is intended to give the 
reader an idea of when geometric aberration calculations are sufficient and when 
it is necessary to use diffraction theory for accurate results. 

The relations for geometric angular aberrations are given in Table 10.8 in 
terms of the two sets of coefficients used to characterize aberrations. Coefficients 
for two-mirror telescopes found in Tables 6.5 and 6.6, for example, can be 
substituted for $B_1$ to $B_3$ in Table 10.8 to give the corresponding $a_{nm}$ for the classical 
aberrations. These values of $a_{nm}$ can, in turn, be used to find the rms wavefront 
errors from Table 10.3. 

As an example of this procedure we take the Ritchey-Chrétien design for the 
Hubble Space Telescope. The principal aberration for nonzero field angles is 
astigmatism, with the angular astigmatism according to geometric theory given in 
Table 6.9. Substituting the values of $m$ and $\beta$ in Table 11.2 into AAS in Table 6.9 
gives AAS $= \Gamma\theta^2/2F = 2B_1a$, where $\Gamma = 8.609$, $B_1$ is the astigmatism 
coefficient, $\theta$ is the field angle, and $a$ is the radius of the aperture stop. 

From Eq. (10.3.3) or Table 10.8 we get $\lambda a_{22} = B_1a^2$, hence $a_{22} = \Gamma\theta^2D/8\lambda F$. 
We now find the rms wavefront error at the diffraction focus of the astigmatic 
image by substituting $\varepsilon = 0.33$ and $a_{22}$ into $\omega$ for astigmatism in Table 10.6. The 
result, with $\theta$ expressed in arc-minutes, is 


$$\omega_{\rm ast}(\mu{\rm m}) = 0.00197\,\theta^2(\text{arc-min}),$$

$$\omega_{\rm ast}(\text{waves}) = 0.00197\,\theta^2(\text{arc-min})/\lambda(\mu{\rm m}). \tag{10.4.1}$$


Choosing $\omega < 1/14$, we find from Eq. (10.4.1) that HST is diffraction-limited for 
$\theta < 4.8$ arc-min at 633 nm, with smaller $\theta$ at shorter wavelengths. Thus, for 
example, an instrument aperture at 3.6 arc-min off-axis is illuminated by images 
that are diffraction-limited for visible and near ultraviolet wavelengths. For an 
aperture at larger field angles, on the other hand, the residual astigmatism of the 
HST must be corrected by the optics following the aperture. 


Table 10.8 

Geometric Angular Aberrations 

Spherical      ASA $= 4B_3a^3 = 4\lambda a_{40}/a$ = angular diameter at diffraction focus 
Coma           ATC $= 3B_2a^2 = 3\lambda a_{31}/a$ = angular length of coma flare 
Astigmatism    AAS $= 2B_1a = 2\lambda a_{22}/a$ = angular diameter at diffraction focus 

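
The coefficient in Eq. (10.4.1) and the 4.8 arc-min field limit for HST can be retraced numerically. The Python sketch below is an added illustration: the function names are hypothetical, and the values $\Gamma = 8.609$, $D = 2.4$ m, $F = 24$, and $\varepsilon = 0.33$ are those quoted in the text. It combines $\lambda a_{22} = \Gamma\theta^2 D/8F$ with the astigmatism entry of Table 10.6.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)  # radians per arc-minute


def hst_astigmatism_rms_um(theta_arcmin, gamma_coef=8.609, diameter=2.4,
                           focal_ratio=24.0, eps=0.33):
    """RMS wavefront error (micrometers) at the diffraction focus of the astigmatic image."""
    theta = theta_arcmin * ARCMIN
    lam_a22 = gamma_coef * theta ** 2 * diameter / (8.0 * focal_ratio)   # lambda*a22, meters
    balance = math.sqrt(1.0 + eps ** 2 + eps ** 4) / (2.0 * math.sqrt(6.0))
    return lam_a22 * balance * 1.0e6


def diffraction_limited_field(wavelength_um, tolerance_waves=1.0 / 14.0):
    """Largest field angle (arc-min) for which omega stays below the tolerance."""
    return math.sqrt(tolerance_waves * wavelength_um / hst_astigmatism_rms_um(1.0))


if __name__ == "__main__":
    print(f"coefficient = {hst_astigmatism_rms_um(1.0):.5f} um per arc-min^2")
    print(f"field limit at 633 nm = {diffraction_limited_field(0.633):.1f} arc-min")
```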

If we approach the angular size of an image from the point of view of 
diffraction theory, then an image that is diffraction-limited has an approximate 
diameter for the Airy disk of $2.44\lambda/D$ or $1.22\lambda/a$ for a clear aperture. Contrary to 
predictions from geometric optics, an image cannot be smaller than that given by 
diffraction theory. It is instructive to take the top four entries for wavefront error 
in Table 10.3, set $\varepsilon = 0$, equate each to $\omega/\lambda = 1/14$, and solve for $a_{nm}$. The result 
is $|a_{nm}| \approx 0.25$ for each of these coefficients, hence the maximum optical path 
difference $\Phi \approx \lambda/4$ for each aberration. This corresponds to the result given in 
Section 4.2 and is often called Rayleigh's quarter-wavelength criterion for the 
amount of aberration that is tolerable in an imaging system. 

If we apply the same procedure to the orthogonal aberrations in Table 10.6, 
with $\varepsilon$ set to zero, we get $|a_{40}| \approx 0.96$, $|a_{31}| \approx 0.60$, $|a_{22}| \approx 0.35$, and 
$|a_{20}| \approx 0.25$. Not surprisingly, the balancing of a classical aberration with a 
focus shift for spherical aberration and astigmatism, and a tilt for coma, gives a 
somewhat larger tolerance on the corresponding coefficients. 
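
These tolerances follow from dividing $\omega = 1/14$ wave by the rms error per unit coefficient. The Python sketch below is an added illustration (the dictionary and function names are hypothetical) using the $\varepsilon = 0$ limits of the entries in Tables 10.3 and 10.6.

```python
import math

# RMS wavefront error per unit coefficient for a clear aperture (eps = 0).
CLASSICAL_RMS = {
    "a40": 2.0 / (3.0 * math.sqrt(5.0)),
    "a31": 1.0 / math.sqrt(8.0),
    "a22": 1.0 / 4.0,
    "a20": 1.0 / (2.0 * math.sqrt(3.0)),
}
ORTHOGONAL_RMS = {
    "a40": 1.0 / (6.0 * math.sqrt(5.0)),
    "a31": 1.0 / (6.0 * math.sqrt(2.0)),
    "a22": 1.0 / (2.0 * math.sqrt(6.0)),
    "a20": 1.0 / (2.0 * math.sqrt(3.0)),
}


def tolerance(rms_per_unit, omega=1.0 / 14.0):
    """Coefficient (in waves) at which the rms error reaches omega."""
    return omega / rms_per_unit


if __name__ == "__main__":
    for name in CLASSICAL_RMS:
        print(f"{name}: classical {tolerance(CLASSICAL_RMS[name]):.2f}, "
              f"orthogonal {tolerance(ORTHOGONAL_RMS[name]):.2f}")
```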

We now compare the size of the Airy disk with a geometrical image whose 
size is computed using the tolerance on $a_{40}$. Substituting $a_{40} = 0.25$ into ASA in 
Table 10.8 we get ASA $= \lambda/a = 2\lambda/D$, and the geometric blur size is comparable 
to the diameter of the Airy disk. For values of $a_{40}$ comparable to the tolerance 
limit or smaller, diffraction calculations are necessary, while for substantially 
larger values of $a_{40}$, the geometric blur size is an accurate measure of the image 
size. 

It should be evident, therefore, that diffraction theory is required when the 
aberrations are small and the separate rms wavefront errors are comparable to the 
diffraction limit. If any one aberration has an rms error substantially larger than 
$\lambda/14$, then geometric aberration analysis is adequate. Fortunately, ray-tracing 
programs can easily do both types of calculations, thus facilitating the choice of 
the theory appropriate for the task. 


10.5. DIFFRACTION INTEGRALS AND FOURIER THEORY 


The starting point for our discussion of diffraction, including aberrations, is 
Eq. (10.1.1). We applied this to rectangular and annular apertures, and derived 
expressions in closed form for the point spread function and encircled energy 
fraction for aberration-free images. There is, of course, no limit placed on the 
shape of the aperture; an example is an annular aperture to which the spider 
structure supporting the secondary mirror of a Cassegrain telescope is added. With 
such an addition Eq. (10.1.1) is still easily solved in closed form for a perfect image, as 
illustrated in Eq. (10.5.1) to follow. The solution of the diffraction integral for 
more complicated apertures is best done using the formalism of Fourier theory, a 
subject area we introduce here and discuss briefly. 


10.5.a. APERTURE FUNCTION 


Writing Eq. (10.1.1) with explicit reference to the coordinates of the aperture 
and image plane we get 

$$U(P) = C \iint_{-\infty}^{\infty} A(\xi, \eta) \exp[ik(p\xi + q\eta)]\, d\xi\, d\eta, \tag{10.5.1}$$

where $p$ and $q$ are functions of $x$ and $y$, respectively, and $A(\xi, \eta)$ is the amplitude 
distribution in the aperture, or aperture function. Aberrations can be incorporated 
by including a term of the form $\exp(ik\Phi)$ in $A(\xi, \eta)$. Although the limits of 
integration extend over an infinite plane, $A(\xi, \eta)$ is nonzero only over the aperture. 

With the diffraction integral rewritten as Eq. (10.5.1), we have an integral in 
the form of a 2D Fourier integral or Fourier transform. We state, without proof, 
that the amplitude distribution in the Fraunhofer diffraction pattern is the Fourier 
transform of the aperture function. Conversely, from Fourier theory, there is an 
inverse transform such that the amplitude distribution in the aperture (or pupil) 
of an optical system is the Fourier transform of the amplitude in the image plane. 
Thus there is a Fourier transform pair connecting the aperture and the Fraunhofer 
image plane. 
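
As a rough numerical illustration of this transform relation, the sketch below evaluates Eq. (10.5.1) by direct summation for a clear rectangular aperture and compares the result with the closed-form product of two sin(x)/x factors. The function names, the grid size, and the choices C = 1 and unit wavelength are assumptions made for this example only.

```python
import cmath
import math


def fraunhofer_amplitude(aperture, half_width, p, q, k, n=200):
    """Midpoint-rule estimate of Eq. (10.5.1), taking C = 1 (an assumption).

    aperture(xi, eta) returns the aperture function A; it is sampled on an
    n x n grid covering the square |xi|, |eta| <= half_width.
    """
    step = 2.0 * half_width / n
    total = 0j
    for i in range(n):
        xi = -half_width + (i + 0.5) * step
        for j in range(n):
            eta = -half_width + (j + 0.5) * step
            a = aperture(xi, eta)
            if a:
                total += a * cmath.exp(1j * k * (p * xi + q * eta))
    return total * step * step


def sinc(x):
    """sin(x)/x with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def rectangle_amplitude(a, b, p, q, k):
    """Closed-form transform of a clear rectangle with half-widths a and b."""
    return 4.0 * a * b * sinc(k * p * a) * sinc(k * q * b)


# Hypothetical example values: unit wavelength, rectangle of size 2 x 1.
K = 2.0 * math.pi
A_HALF, B_HALF = 1.0, 0.5


def rectangle(xi, eta):
    """Aperture function: 1 inside the rectangle, 0 outside."""
    return 1.0 if (abs(xi) <= A_HALF and abs(eta) <= B_HALF) else 0.0


numeric = fraunhofer_amplitude(rectangle, 1.0, 0.3, 0.2, K)
analytic = rectangle_amplitude(A_HALF, B_HALF, 0.3, 0.2, K)
print(abs(numeric - analytic))
```

An aberrated pupil could be explored with the same routine by letting the aperture function return a complex value of the form exp(ikΦ) inside the aperture.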


10.5.b. EXAMPLE: SPIDER IN CASSEGRAIN TELESCOPE 


Fig. 10.18. Entrance pupil of typical two-mirror telescope with spider structure. See the text for 
discussion. 

The entrance pupil of most two-mirror telescopes is an annulus plus a four-legged 
spider structure, as shown in Fig. 10.18. Therefore the aperture function is 
a clear aperture of radius $a$, a central obscuration of radius $\varepsilon a$, and two bars of 
length $2a$ (one along $\xi$ and one along $\eta$) less two bars of length $2\varepsilon a$. Each 
obscuration is given a minus sign because it subtracts from the clear aperture. 
With this aperture function Eq. (10.5.1) becomes 

$$
\begin{aligned}
U(P) = \text{Eq. (10.2.5)} &- C \int_{-b}^{b} \exp(-i\beta)\, d\eta \left[ \int_{-a}^{-\varepsilon a} \exp(-i\gamma)\, d\xi + \int_{\varepsilon a}^{a} \exp(-i\gamma)\, d\xi \right] \\
&- C \int_{-b}^{b} \exp(-i\gamma)\, d\xi \left[ \int_{-a}^{-\varepsilon a} \exp(-i\beta)\, d\eta + \int_{\varepsilon a}^{a} \exp(-i\beta)\, d\eta \right],
\end{aligned}
\tag{10.5.2}
$$

where $\gamma = kp\xi$ and $\beta = kq\eta$. The integral on the first line of Eq. (10.5.2) is the 
horizontal part of the spider; the integral on the second line is the vertical part. 
Evaluating these integrals we get 
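
$$
\begin{aligned}
U(P) = \text{Eq. (10.2.7)} - 4abC(1 - \varepsilon) \Bigg[ &\frac{\sin(kqb)}{kqb}\, \frac{\sin(kpd/2)}{kpd/2}\, \cos\!\left( kp\, \frac{(1 + \varepsilon)a}{2} \right) \\
+ &\frac{\sin(kpb)}{kpb}\, \frac{\sin(kqd/2)}{kqd/2}\, \cos\!\left( kq\, \frac{(1 + \varepsilon)a}{2} \right) \Bigg],
\end{aligned}
\tag{10.5.3}
$$

where $d = (1 - \varepsilon)a$ is the length of each of the four bars in the spider and $2b$ is 
the width of each bar. Note that part of the argument of the cosine is the distance 
from the center of the aperture to the midpoint of each of the bars, $(1 + \varepsilon)a/2$. 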
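
The step from Eq. (10.5.2) to Eq. (10.5.3) can be checked numerically. The sketch below integrates the horizontal pair of bars directly and compares the result with the corresponding closed-form term. The helper names, the sample dimensions, and the choice C = 1 are assumptions for illustration; the arguments `kp` and `kq` stand for the products $kp$ and $kq$.

```python
import cmath
import math


def exp_integral(lo, hi, kappa, n=2000):
    """Midpoint-rule estimate of the integral of exp(-i*kappa*x) from lo to hi."""
    step = (hi - lo) / n
    return sum(cmath.exp(-1j * kappa * (lo + (m + 0.5) * step))
               for m in range(n)) * step


def sinc(x):
    """sin(x)/x with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def horizontal_bars_numeric(a, b, eps, kp, kq):
    """Both horizontal bars of Eq. (10.5.2), integrated numerically (C = 1)."""
    along = exp_integral(-a, -eps * a, kp) + exp_integral(eps * a, a, kp)
    across = exp_integral(-b, b, kq)
    return along * across


def horizontal_bars_closed(a, b, eps, kp, kq):
    """First term in the brackets of Eq. (10.5.3), times 4ab(1 - eps)."""
    d = (1.0 - eps) * a
    return (4.0 * a * b * (1.0 - eps) * sinc(kq * b) * sinc(kp * d / 2.0)
            * math.cos(kp * (1.0 + eps) * a / 2.0))


# Hypothetical dimensions: a = 1, bar half-width b = 0.01, eps = 0.33.
a, b, eps = 1.0, 0.01, 0.33
numeric = horizontal_bars_numeric(a, b, eps, kp=7.0, kq=3.0)
closed = horizontal_bars_closed(a, b, eps, kp=7.0, kq=3.0)
# At p = q = 0 the two pairs of bars together give the blocked area.
blocked_area = 2.0 * horizontal_bars_closed(a, b, eps, 0.0, 0.0)
print(abs(numeric - closed), blocked_area)
```

At zero frequency each pair of bars contributes $4ab(1 - \varepsilon)$, so the two pairs together remove an area $8ab(1 - \varepsilon)$ from the annulus.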

With $U(P)$ from Eq. (10.5.3), it is now straightforward to find $I(P) = |U(P)|^2$. 
The intensity at the peak is $I_0 = |U(0)|^2$, where $U(0) = C[\pi a^2(1 - \varepsilon^2) - 8ab(1 - \varepsilon)]$ 
$= C$ times the area of the open aperture. 


10.5.c. ARRAY THEOREM 


The spider in the previous section consists of two pairs of bars with each bar 
displaced from the center of the aperture. As noted following Eq. (10.5.3), the 
amount of this displacement from the center is explicitly part of U(P). If each 
cosine in Eq. (10.5.3) is written in terms of complex exponentials using Euler’s 
relations, then each bar has a factor representing its displacement in either the 
positive or negative direction along one of the axes. This association between 
displacement and a complex exponential suggests a closer look at identical 
multiple apertures or multiple obstacles within some larger aperture. 

Consider a large screen containing $N$ identical apertures with the designated 
center of each at $(\xi_i, \eta_i)$ in the $(\xi, \eta)$ coordinate frame. Let $(\xi', \eta')$ be the local 
coordinates of each aperture relative to its center, hence $\xi = \xi_i + \xi'$, $\eta = \eta_i + \eta'$. 
With this aperture Eq. (10.5.1) becomes 

$$U(P) = C \sum_{i=1}^{N} \iint_{-\infty}^{\infty} A(\xi', \eta') \exp\{ik[p(\xi_i + \xi') + q(\eta_i + \eta')]\}\, d\xi'\, d\eta', \tag{10.5.4}$$

where $A(\xi', \eta')$ is the aperture function for a single hole. 

Factoring $\exp[ik(p\xi_i + q\eta_i)]$ from each integral in Eq. (10.5.4) we get 

$$U(P) = C \iint_{-\infty}^{\infty} A(\xi', \eta') \exp[ik(p\xi' + q\eta')]\, d\xi'\, d\eta' \times \sum_{i=1}^{N} \exp[ik(p\xi_i + q\eta_i)], \tag{10.5.5}$$

where each term in the sum locates the center of one of the $N$ apertures. 

Equation (10.5.5) is a statement of the array theorem: the amplitude at point P 
in the Fraunhofer diffraction pattern of an array of identical apertures (or 
obstacles) is the Fourier transform of an individual aperture function times a 
function representing the positions of the aperture centers in the diffracting 
screen. 
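
A small numerical check of the array theorem is sketched below. It compares the sum of integrals over every hole in screen coordinates with the single-hole transform multiplied by the array factor. Rectangular holes are assumed so that each 2D integral separates into two 1D integrals; the hole positions, sizes, and function names are hypothetical, C is set to 1, and `kp`, `kq` stand for the products $kp$ and $kq$.

```python
import cmath


def exp_integral(lo, hi, kappa, n=400):
    """Midpoint-rule estimate of the integral of exp(+i*kappa*x) from lo to hi."""
    step = (hi - lo) / n
    return sum(cmath.exp(1j * kappa * (lo + (m + 0.5) * step))
               for m in range(n)) * step


def direct_sum(centers, half_w, half_h, kp, kq):
    """Sum of integrals over every hole, each taken in screen coordinates."""
    total = 0j
    for xi_i, eta_i in centers:
        total += (exp_integral(xi_i - half_w, xi_i + half_w, kp)
                  * exp_integral(eta_i - half_h, eta_i + half_h, kq))
    return total


def array_theorem(centers, half_w, half_h, kp, kq):
    """Single-hole transform times the sum of phase factors for the centers."""
    single = (exp_integral(-half_w, half_w, kp)
              * exp_integral(-half_h, half_h, kq))
    array_factor = sum(cmath.exp(1j * (kp * xi_i + kq * eta_i))
                       for xi_i, eta_i in centers)
    return single * array_factor


# Hypothetical screen: three identical rectangular holes.
centers = [(-2.0, 0.5), (0.0, -1.0), (1.5, 1.0)]
lhs = direct_sum(centers, 0.3, 0.2, kp=1.7, kq=-0.9)
rhs = array_theorem(centers, 0.3, 0.2, kp=1.7, kq=-0.9)
print(abs(lhs - rhs))
```

The two results agree to rounding error because shifting a hole only multiplies its transform by a phase factor, which is the content of Eq. (10.5.5).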

In our example of a spider on an annular aperture there are two aperture 
functions, a horizontal bar of length $(1 - \varepsilon)a$ and width $2b$ and a vertical one of 
the same dimensions, with each displaced by $(1 + \varepsilon)a/2$ along its long 
dimension. The net result is the pattern shown in Fig. 10.18. The reader can 
verify that applying Eq. (10.5.5) to this example gives U(P) in Eq. (10.5.3). 
Another example showing the utility of the array theorem is that of the HST 
pupil, the pupil shown in Fig. 10.18 plus three circular pads near the outer edge of 
the primary. The HST pupil, with coordinates and dimensions, and U(P) 
calculated from Eq. (10.5.5) are found in a paper by Schroeder and Golimowski 
(1996). 

The array theorem is useful in any case where there are multiple diffracting 
apertures or obstacles. We make use of the array theorem in discussing telescope 
arrays in Chapter 18. 


10.5.d. CONCLUDING REMARKS 


The reciprocal nature of the integrals in a Fourier transform pair is a 
mathematical consequence of the fact that the propagation of light through an 
optical system is reversible. The utility of Eq. (10.5.1) and its inverse transform 
were strikingly evident when the first aberrated images from HST were examined. 
An amplitude distribution derived from the observed PSF could be used to find 
the aberration part of the aperture function. Conversely, a host of aberration 
functions could be inserted into Eq. (10.5.1) and the computed $I(P)$ for each 
compared to the observed PSF. These calculations led the way to recognition of 
the significant spherical aberration in the HST primary. We give some of the 
quantitative results from this analysis in Chapter 11. 

We also see Fourier transform pairs occurring in our discussion of transfer 
functions, the first topic of Chapter 11. The proofs of these statements, along with 
extensive discussions of the connection between Fourier theory and optics, can be 
found in many intermediate optics texts, for example, by Hecht (1987). The text 
by Gaskill (1978) is also a useful source of information on Fourier optics. 


Chapter 11 Transfer Functions; Hubble Space Telescope 


The results in the preceding chapter provide a complete description of the 
characteristics of a perfect or near-perfect image of a distant point object. The 
response of an optical system to a set of point objects or, more generally, an 
arbitrary intensity distribution was not considered in that analysis. Clearly this 
response depends on factors in addition to the PSF such as, for example, blurring 
due to image motion or detector pixel size. Factors such as these are most easily 
included by using the theory of transfer functions to describe the system response 
and image characteristics. 


11.1. TRANSFER FUNCTIONS AND IMAGE CHARACTERISTICS 


This approach to image analysis makes use of a complex function called the 
optical transfer function or OTF, with the modulus of the OTF called the 
modulation transfer function or MTF. One advantage of this approach is that 
each independent component of a complete system, from the atmosphere to the 
detector, has its own OTF, and the system OTF is the product of the separate 
OTFs. This separation also applies to different types of wavefront error, with 
separate OTFs for geometric aberrations, random wavefront errors, and blurring 
due to image motion. The response of the system to an incident wavefront is 
determined by the system OTF comprising all these factors. 

In this section, following a discussion of basic concepts, we draw upon results 
derived from the theory of transfer functions and show how they are used to 
determine image characteristics. For derivations and discussion of the theory, the 
reader should consult references given at the end of the chapter. 


11.1.a. DEFINITION OF THE TRANSFER FUNCTION 


The concept of the transfer function is most easily seen by assuming a specific 
object intensity distribution. Consider a set of equally spaced line sources whose 
intensity in a direction perpendicular to the lines varies sinusoidally, as shown in 
Fig. 11.1(a). Two parameters that describe this source are the spacing between the 
lines and the contrast. We let $p_0$ denote the spacing, or spatial period, where 
$\nu_0 = 1/p_0$ is the spatial frequency in cycles per unit length. The contrast $C_o$ of the 
object, in the notation of Fig. 11.1(a), is defined as 

$$C_o = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}}, \tag{11.1.1}$$

where $C_o$ is assumed independent of $\nu_0$. 

Assuming an optical system of constant magnification, the image of this object 
is also a sinusoidal intensity distribution, as shown in Fig. 11.1(b). Because each 
object point is imaged as a blur given by the PSF (or line spread function in one 
dimension), the image intensity is the superposition of all the individual spread 
functions. This addition of intensities assumes the illumination is incoherent. 

We let $p$ and $\nu$ denote the spatial period and frequency, respectively, at the 
image surface. The contrast $C_i$ in the image is defined according to Eq. (11.1.1), 
with the maximum and minimum image intensities substituted. For a system with 
magnification $m$, we have $p = p_0 m$ and $\nu = \nu_0/m$. If the object distance is infinite, the 
spatial period and frequency of the object become angular period and frequency, 
with corresponding angular units. The image can also be described in angular 
terms in this case. 

The modulation transfer function $T$ is a measure of the change in contrast 
between the object and image, defined as 

$$T(\nu) = \frac{C_i}{C_o} = \frac{\text{contrast in image at } \nu}{\text{contrast in object}}. \tag{11.1.2}$$

Given that each object point is imaged as a blur described by the point or line 
spread function, we expect $T(\nu) < 1$ for all spatial frequencies. We also expect to 
find that $T(\nu) \to 1$ as $\nu \to 0$ and $T(\nu) \to 0$ as $\nu$ approaches the resolution limit 
set by the width of the PSF. The spatial frequency at which contrast in a perfect 
image goes to zero is called the cutoff frequency $\nu_c$. All information at frequencies 
higher than the cutoff frequency is lost. 
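
The definition in Eq. (11.1.2) can be tried out directly. The sketch below blurs a sinusoidal object with a line spread function, measures the contrast before and after, and takes the ratio. The Gaussian line spread function, the sample values, and the function names are assumptions chosen for illustration (a Gaussian is convenient because its transfer function is known in closed form); this is not the spread function of a real telescope.

```python
import math


def contrast(samples):
    """Contrast as defined in Eq. (11.1.1)."""
    i_max, i_min = max(samples), min(samples)
    return (i_max - i_min) / (i_max + i_min)


def blurred_sinusoid(nu, c_obj, s, n_x=200, n_u=401):
    """Image of 1 + c_obj*cos(2*pi*nu*x) after blurring by a Gaussian LSF.

    The Gaussian line spread function of standard deviation s is an
    assumption made for this sketch.  Intensities add (incoherent light).
    """
    half = 5.0 * s
    du = 2.0 * half / (n_u - 1)
    us = [-half + m * du for m in range(n_u)]
    weights = [math.exp(-u * u / (2.0 * s * s)) for u in us]
    norm = sum(weights)
    image = []
    for j in range(n_x):
        x = j / (n_x * nu)  # samples span one full period of the pattern
        value = sum(w * (1.0 + c_obj * math.cos(2.0 * math.pi * nu * (x - u)))
                    for w, u in zip(weights, us))
        image.append(value / norm)
    return image


nu, c_obj, s = 2.0, 0.8, 0.1  # hypothetical frequency, contrast, blur width
obj = [1.0 + c_obj * math.cos(2.0 * math.pi * j / 200) for j in range(200)]
img = blurred_sinusoid(nu, c_obj, s)
mtf = contrast(img) / contrast(obj)
# Closed-form transfer function of a Gaussian LSF, for comparison.
expected = math.exp(-2.0 * math.pi ** 2 * s ** 2 * nu ** 2)
print(mtf, expected)
```

The measured ratio is below one and falls as the frequency or the blur width increases, in line with the behavior expected of $T(\nu)$.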

Fig. 11.1. (a) Object sine wave intensity; (b) image intensity profile, unshifted; and (c) image 
intensity profile, shifted. 

To determine the approximate cutoff frequency we apply the Rayleigh criterion 
of resolution to diffraction-limited images whose profiles are given by Eq. 
(10.2.8). As given in Section 10.2.e, this criterion states that two images of 
equal intensity are just resolved when the peak of one coincides with the first 
minimum of the other. For annular apertures Eq. (10.2.22) gives the angular limit 
of resolution as $1.22\gamma\lambda/D$, or approximately $\lambda/D$. The sum of two profiles like 
that in Fig. 10.5 at this separation gives an intensity midway between the peaks of 
approximately 0.8 that of either peak, and the peaks are “just resolved.” For a 
rectangular or square aperture, the corresponding angular separation of peaks of 
equal brightness that are just resolved is $\lambda/b$, where $b$ is the aperture dimension 
parallel to the line joining the peaks of the PSFs. The intensity midway between 
the peaks is again approximately 0.8 that of either peak. 

For an object at infinity, the angle $\lambda/D$ corresponds to a linear separation of 
$f\lambda/D$ at the image surface, or a spatial frequency of $1/(\lambda F)$. A rigorous derivation 
shows that the cutoff frequency is $\nu_c = 1/(\lambda F)$, in linear units, with a corresponding 
cutoff frequency in angular units of $D/\lambda$. A good introduction to the theory of 
transfer functions, including derivation of the cutoff frequency, is given by Smith 
(1963). It is convenient to define the normalized spatial frequency $\nu_n$ as 

$$\nu_n = \nu/\nu_c = \nu_0/\nu_{0c}, \tag{11.1.3}$$

where the range of this parameter is zero to one. Comparisons of different optical 
systems, or the same system at different wavelengths, are most often made in 
normalized units. 

In addition to reduced contrast, the intensity pattern may also be shifted 
laterally on the image surface, as shown in Fig. 11.1(c). This shift occurs if 
asymmetric aberrations, such as coma, are present. If the linear shift on the image 
surface is $\delta$, the phase transfer function $\Phi$ is defined as 

$$\Phi = 2\pi\delta/p. \tag{11.1.4}$$

A combination of Eqs. (11.1.2) and (11.1.4) leads to the definition of the complex 
optical transfer function $Y(\nu)$ as 

$$Y(\nu) = T(\nu) \exp[i\Phi(\nu)], \tag{11.1.5}$$

where each independent component of a system has its own $Y(\nu)$. The two 
mirrors of a Cassegrain telescope, for example, are considered a single component 
because the image quality is determined by the mirror combination. 

Given these definitions it is possible, in principle, to determine the response of 
a system to any object intensity distribution. From the theory of Fourier analysis, 
one finds that any such distribution can be synthesized by some combination of 
sinusoidal functions of different frequencies. The transformation of each harmonic 
component of the object into the corresponding harmonic part of the image is 
determined by $Y$ at that frequency. 

An alternative method of finding the OTF is by calculating the autocorrelation 
of the pupil or aperture function. The pupil function for a perfect system is the 
transmittance, usually constant, within the boundaries of the exit pupil and is zero 
outside. For a system with aberrations, the pupil function is complex and includes 
aberrations, as noted in Section 10.5. The autocorrelation integral is essentially 
one that gives the area of overlap between two pupil functions, with one shifted 
relative to the other by an amount proportional to the spatial frequency. The 
reader should consult the references by both Born and Wolf and by Wetherell 
cited throughout this book for discussion of this approach to calculating the OTF. 
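
The overlap picture of the autocorrelation can be sketched numerically. The code below counts grid cells in the overlap of two copies of a unit-radius pupil displaced by $\pm\nu_n$ and compares the normalized overlap for a clear pupil with the well-known closed form for a circular aperture. The grid size, the cell-counting quadrature, and the function names are choices made for this sketch only.

```python
import math


def pupil_autocorrelation(pupil, nu_n, n=300):
    """Normalized autocorrelation of a pupil function at frequency nu_n.

    pupil(x, y) is defined on a unit-radius aperture; a relative shift of
    one full diameter (2 units) corresponds to the cutoff, nu_n = 1.
    """
    step = 2.0 / n
    overlap = 0j
    energy = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * step
        for j in range(n):
            y = -1.0 + (j + 0.5) * step
            energy += abs(pupil(x, y)) ** 2
            overlap += pupil(x + nu_n, y) * pupil(x - nu_n, y).conjugate()
    return overlap / energy


def clear_pupil(x, y):
    """Perfect, unobstructed pupil: unit transmittance inside the unit circle."""
    return complex(1.0) if x * x + y * y <= 1.0 else complex(0.0)


def clear_mtf(nu_n):
    """Closed-form MTF of a clear circular aperture."""
    if nu_n >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(nu_n)
                              - nu_n * math.sqrt(1.0 - nu_n ** 2))


print(pupil_autocorrelation(clear_pupil, 0.5).real, clear_mtf(0.5))
```

For an aberrated system the pupil function would return a complex value inside the aperture, and the result would in general be complex as well.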

The discussion in this section is intended as an introduction to the basic 
characteristics of the transfer function. We now turn our attention to the relation 
between the transfer function and image characteristics for the important case 
where the PSF is symmetric about the system axis. 


11.1.b. POINT SPREAD FUNCTION AND ENCIRCLED ENERGY 


The relations between image characteristics and the transfer function are 
derived using the theory of Fourier transforms. Given a PSF computed by the 
methods described in Section 10.2, the OTF is defined as the Fourier transform of 
the PSF. Because the PSF and OTF are a Fourier transform pair, the former can be 
calculated if the latter is known. For our purposes, we consider only the case 
where the phase transfer function $\Phi$ is zero and the OTF reduces to the MTF. 
This limitation rules out the treatment of asymmetric aberrations such as coma. 

In rectangular coordinates the MTF is given by 

$$T(\nu_x, \nu_y) = A \iint i(x, y) \exp[-2\pi i(\nu_x x + \nu_y y)]\, dx\, dy, \tag{11.1.6}$$

where $\nu_x = \nu\cos\phi$, $\nu_y = \nu\sin\phi$, $x$ and $y$ are given in Eq. (10.2.1), and $A$ is a 
normalization factor chosen to give $T = 1$ at $\nu = 0$. 

In polar coordinates the corresponding relation to Eq. (11.1.6) is 

$$T(\nu, \phi) = A \int_0^{2\pi}\!\! \int_0^{\infty} i(r, \psi) \exp[-2\pi i \nu r \cos(\psi - \phi)]\, r\, dr\, d\psi, \tag{11.1.7}$$

where $\phi$ can be assigned any convenient value for the special but important case 
where the PSF is symmetric about the system axis. With $\phi$ fixed, the integration 
over $\psi$ in Eq. (11.1.7) is one of substituting the integral form of $J_0$, as done with 
Eq. (10.2.6). 

Given a symmetric PSF and a circular aperture with a central obscuration, the 
methods of Fourier transforms give the following relations between the PSF, EE, 
and MTF: 

$$T(\nu) = \frac{\pi^2 (1 - \varepsilon^2) D^2}{2\lambda^2} \int_0^{\infty} i(\alpha)\, J_0(2\pi\nu\alpha)\, \alpha\, d\alpha, \tag{11.1.8}$$

$$\mathrm{PSF}(\alpha) = \frac{8\lambda^2}{(1 - \varepsilon^2) D^2} \int_0^{\nu_c} T(\nu)\, J_0(2\pi\nu\alpha)\, \nu\, d\nu, \tag{11.1.9}$$

$$\mathrm{EE}(\alpha) = 2\pi\alpha \int_0^{\nu_c} T(\nu)\, J_1(2\pi\nu\alpha)\, d\nu, \tag{11.1.10}$$

where $\nu$ is the frequency in angular units, $\alpha$ is the angular radius of the image, and 
$J_0$ and $J_1$ are Bessel functions. Because $\nu$ is given in angular units, the cutoff 
frequency $\nu_c$ in these units is $D/\lambda$. Equations (11.1.8)-(11.1.10) can be written in 
linear units by substituting $f\nu$ (linear) for $\nu$ (angular) and $r/f$ for $\alpha$, where $r = f\alpha$. 
The factors outside the integrals in Eqs. (11.1.8)-(11.1.10) are normalization 
factors, with $T(0) = 1$ and PSF(0) $= 1$ for a perfect image, and EE($\infty$) $= 1$. 

Note the reciprocal relationship between Eqs. (11.1.8) and (11.1.9). Given the 
point spread function $i(\alpha)$ we can find $T(\nu)$ or, conversely, given $T(\nu)$ we can 
compute $i(\alpha)$. 

For ease of calculation and comparison of results for different systems or 
wavelengths, it is useful to rewrite these relations in terms of the normalized 
frequency $\nu_n$. The results are 

$$T(\nu_n) = \frac{\pi^2 (1 - \varepsilon^2)}{2} \int_0^{\infty} i(w)\, J_0(2\pi\nu_n w)\, w\, dw, \tag{11.1.11}$$

$$\mathrm{PSF}(w) = \frac{8}{1 - \varepsilon^2} \int_0^{1} T(\nu_n)\, J_0(2\pi\nu_n w)\, \nu_n\, d\nu_n, \tag{11.1.12}$$

$$\mathrm{EE}(w) = 2\pi w \int_0^{1} T(\nu_n)\, J_1(2\pi\nu_n w)\, d\nu_n, \tag{11.1.13}$$

where $w = \alpha D/\lambda = \alpha\nu_c$. Comparing the argument of each Bessel function with 
Eq. (10.2.10), we see that $\pi w = v$, the dimensionless parameter used in Section 
10.2. 

For calculations of PSF and EE, all that is needed is the MTF. The general 
expression for the MTF of a perfect circular pupil with a central obscuration, 
taken from Appendix B of the reference by Wetherell (1980), is given in slightly 
modified form in Table 11.1. For a clear circular aperture the factors $B$ and $C$ are 
zero. Substituting $2A/\pi$ from Table 11.1 into Eq. (11.1.12), it is a simple 
calculation to verify that PSF(0) $= 1$ for a clear aperture, as required by 
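normalization. 

Table 11.1 

Modulation Transfer Function for Perfect Lens with Central Obscuration 

$$T(\nu_n) = \frac{2}{\pi}\, \frac{A + B + C}{1 - \varepsilon^2}$$

$$A = \cos^{-1}\nu_n - \nu_n (1 - \nu_n^2)^{1/2}, \qquad 0 \le \nu_n \le 1$$

$$B = \begin{cases} \varepsilon^2 \left\{ \cos^{-1}\!\left( \dfrac{\nu_n}{\varepsilon} \right) - \dfrac{\nu_n}{\varepsilon} \left[ 1 - \left( \dfrac{\nu_n}{\varepsilon} \right)^2 \right]^{1/2} \right\}, & 0 \le \nu_n \le \varepsilon \\[2ex] 0, & \nu_n > \varepsilon \end{cases}$$

$$C = \begin{cases} -\pi\varepsilon^2, & 0 \le 2\nu_n \le 1 - \varepsilon \\[1ex] -\pi\varepsilon^2 + \varepsilon \sin\chi + \dfrac{\chi}{2}(1 + \varepsilon^2) - (1 - \varepsilon^2) \tan^{-1}\!\left[ \dfrac{1 + \varepsilon}{1 - \varepsilon} \tan\dfrac{\chi}{2} \right], & 1 - \varepsilon \le 2\nu_n \le 1 + \varepsilon \\[2ex] 0, & 2\nu_n > 1 + \varepsilon \end{cases}$$

$$\chi = \cos^{-1}\!\left( \frac{1 + \varepsilon^2 - 4\nu_n^2}{2\varepsilon} \right)$$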

Modulation transfer functions for selected values of $\varepsilon$ are shown in Fig. 11.2. 
The main effect of a larger central obscuration is a decrease in the MTF in the 
middle of the frequency range. This is expected because the effect of the 
obscuration on the PSF is to put more energy into the first bright ring of the 
Airy pattern, and the contrast in the image of an extended object is reduced 
because of the larger fraction of energy in this ring. For spatial frequencies near 
the cutoff frequency, on the other hand, the MTF is slightly larger when the pupil 
has an obscuration. This is also expected because the FWHM of the Airy disk is 
smaller for larger $\varepsilon$, and the “sharper” peak implies a smaller limit of resolution 
according to the Rayleigh criterion. 

When $T(\nu_n)$ from Table 11.1 is substituted into Eqs. (11.1.12) and (11.1.13), 
and the equations are integrated numerically, results like those shown in Figs. 
10.5 and 10.8 are obtained. For a perfect image it is obviously easier to use Eqs. 
(10.2.8) and (10.2.17) to find the PSF and EE, respectively, but in the presence of 
aberrations it is usually easier to use the MTF approach. 
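
The numerical integration described above can be sketched for the simplest case, a clear aperture ($\varepsilon = 0$), and compared with the familiar Airy pattern. The Bessel functions are evaluated from their integral representation so that only the standard library is needed; the function names, the midpoint quadrature, and the step counts are assumptions made for this example.

```python
import math


def bessel_j(order, x, n=200):
    """J_order(x) from (1/pi) * integral_0^pi cos(order*t - x*sin(t)) dt."""
    step = math.pi / n
    total = 0.0
    for m in range(n):
        t = (m + 0.5) * step
        total += math.cos(order * t - x * math.sin(t))
    return total * step / math.pi


def clear_mtf(nu_n):
    """MTF of a clear circular aperture, 2A/pi from Table 11.1."""
    if nu_n >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(nu_n)
                              - nu_n * math.sqrt(1.0 - nu_n ** 2))


def psf(w, n=400):
    """Eq. (11.1.12) with eps = 0, integrated with the midpoint rule."""
    step = 1.0 / n
    total = 0.0
    for m in range(n):
        nu_n = (m + 0.5) * step
        total += clear_mtf(nu_n) * bessel_j(0, 2.0 * math.pi * nu_n * w) * nu_n
    return 8.0 * total * step


def encircled_energy(w, n=400):
    """Eq. (11.1.13) with eps = 0, integrated with the midpoint rule."""
    step = 1.0 / n
    total = 0.0
    for m in range(n):
        nu_n = (m + 0.5) * step
        total += clear_mtf(nu_n) * bessel_j(1, 2.0 * math.pi * nu_n * w)
    return 2.0 * math.pi * w * total * step


def airy_psf(w):
    """Airy pattern [2 J1(v)/v]^2 with v = pi*w, for comparison."""
    v = math.pi * w
    return 1.0 if v == 0.0 else (2.0 * bessel_j(1, v) / v) ** 2


print(psf(0.0), psf(0.5), airy_psf(0.5))
```

To include a degradation factor, the MTF inside each integrand would be multiplied by that factor before integrating.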

The calculation of MTFs in the presence of symmetrical aberrations is done by 
either evaluating the autocorrelation integral, with the aberrations included in the 
pupil function, or integrating Eq. (11.1.11) with i(w) for the aberrated image from 
Eq. (10.3.4). For images with defocus computed from Eq. (10.3.4), as shown in 
Figs. 10.12—10.14, the MTF curves are shown in Fig. 11.3. For a spherically 
aberrated image, as shown in Figs. 10.15-10.16, MTF curves at selected foci are 
shown in Fig. 11.4. 

Fig. 11.2. Normalized modulation transfer function for several obscuration ratios calculated from 
relations in Table 11.1. 

Fig. 11.3. Modulation transfer functions for perfect system ($\varepsilon = 0$) with defocus calculated from 
Eq. (11.1.11). The rms wavefront error $\omega$ is given in units of waves. 

Fig. 11.4. Modulation transfer functions for unobstructed system ($\varepsilon = 0$) with spherical aberration 
calculated from Eq. (11.1.11). Curves are given for paraxial, diffraction, and marginal foci. 


11.1.c. MODULATION TRANSFER FUNCTIONS FOR OTHER 
WAVEFRONT ERRORS 


In addition to wavefront errors due to classical aberrations, often called figure 
errors, we noted in the preceding that random errors on a finer scale due to the 
polishing process may also be present. Another source of image degradation is 
motion of an image due to effects from outside the optical system. Each of these 
nonfigure error contributions can be modeled with a factor in the MTF that is a 
statistical average of the effect. In this section we give an overview of some of 
these MTF models and their effects on the PSF and EE. 

The prescribed way of including additional, independent MTF factors is to 
write the system MTF as a product of independent factors in the form 

$$T = T_D T_f T_r T_p, \tag{11.1.14}$$

where $T$ is the system MTF, $T_D$ is the MTF for a perfect system, as given in Table 
11.1, and the remaining factors are degradation functions. The subscripts $f$, $r$, and 
$p$ denote, in turn, contributions due to figure, random, and pointing errors. It is at 
this point where the advantage of Fourier transforms and the transfer function 
approach is most evident. If a degradation of the wavefront can be described 
mathematically, then its MTF can be computed and included in the system MTF 
by a simple multiplication, provided this degradation is independent of all others. 
Complex wavefront degradation can therefore be reduced to a relatively simple 
set of separate contributions. 

In considering random errors on a wavefront, we assume that all figure errors 
have been subtracted from the wavefront map at the exit pupil. Figure error is 
usually taken to be those components with spatial frequencies less than 
5 cycles/radius over the pupil. We also assume that the remaining wavefront 
errors of higher spatial frequency are distributed in a random fashion over the 
residual wavefront. The choice for the upper limit to the spatial frequency 
depends on the size of the spatial period selected. For the 2.4-m primary 
mirror of the Hubble Space Telescope, a spatial period of 1 mm corresponds to 
a spatial frequency of 1200 cycles/radius. Random error in this middle range of 
spatial frequencies is often termed ripple. 

A statistical analysis of this type of error has been made by O’Neill (1963). 
The result of this analysis is an MTF degradation factor in the midfrequency 
range of the form 

$$T_m = \exp\{-k^2\omega_m^2[1 - c(\nu_n)]\}, \tag{11.1.15}$$

where $k = 2\pi/\lambda$, $\omega_m$ is the rms random wavefront error for midfrequencies, and 
$c(\nu_n)$ is the normalized autocorrelation function of the residual pupil function. 
The characteristics of $c(\nu_n)$ are such that $c(0) = 1$ and $c(\nu_n) \to 0$ for a large shift 
of the residual wavefront in the autocorrelation integral. If the function $c(\nu_n)$ is 
modeled as a Gaussian of the form $c(\nu_n) = \exp(-4\nu_n^2/l^2)$, as given by Wetherell 
(1980), the degradation function $T_m$ has the form shown in Fig. 11.5. The 
parameter $l$ is the normalized correlation length and is a measure of the structure 
on the wavefront. To a rough approximation, the spatial frequency of the dominant 
structure is $1/l$ cycles/diameter. For discussion of other forms of $c(\nu_n)$ and 
comparison with measured results, the reader should consult the reference by 
Wetherell cited here. 

Wavefront errors with high spatial frequencies, those larger than ones 
associated with ripple, are ascribed to microstructure on an optical surface, and 
often called microripple. The degradation function for high-frequency microripple 
follows directly from Eq. (11.1.15) if we let $l \to 0$ in the autocorrelation 
function. In this limit 

$$T_h = \exp(-k^2\omega_h^2) \tag{11.1.16}$$

at all spatial frequencies except $\nu_n = 0$, where $\omega_h$ is the rms wavefront error due 
to microripple. The product of Eqs. (11.1.15) and (11.1.16) is the degradation 
function $T_r$ in Eq. (11.1.14). 
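
A minimal sketch of these two degradation factors follows. It assumes the rms errors are expressed in waves, so that the product of $k$ and the rms error becomes $2\pi$ times the error in waves, and it uses the Gaussian form of $c(\nu_n)$ quoted above. The function and parameter names are illustrative.

```python
import math


def midfrequency_factor(nu_n, omega_m, corr_len):
    """T_m of Eq. (11.1.15) with c(nu_n) = exp(-4 nu_n^2 / l^2).

    omega_m is the rms midfrequency wavefront error in waves (assumption),
    and corr_len is the normalized correlation length l.
    """
    c = math.exp(-4.0 * nu_n ** 2 / corr_len ** 2)
    return math.exp(-(2.0 * math.pi * omega_m) ** 2 * (1.0 - c))


def microripple_factor(omega_h):
    """T_h of Eq. (11.1.16), valid at all nu_n except nu_n = 0."""
    return math.exp(-(2.0 * math.pi * omega_h) ** 2)


def random_factor(nu_n, omega_m, corr_len, omega_h):
    """T_r as the product of the midfrequency and microripple factors."""
    if nu_n == 0.0:
        return 1.0
    return (midfrequency_factor(nu_n, omega_m, corr_len)
            * microripple_factor(omega_h))


# Values matching the caption of Fig. 11.5: 0.1 waves rms, l = 0.04.
for nu in (0.0, 0.01, 0.02, 0.05, 0.5):
    print(nu, midfrequency_factor(nu, 0.1, 0.04))
```

For frequencies well above $l$ the midfrequency factor levels off at $\exp(-k^2\omega_m^2)$, which is the same form as the microripple factor.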

Fig. 11.5. Midfrequency degradation factor $T_m$ with Gaussian correlation factor calculated from 
Eq. (11.1.15). The rms ripple error is 0.1 waves; $l$ is the normalized correlation length. 

Fig. 11.6. Pointing degradation factor $T_p$ for several normalized rms pointing errors calculated 
from Eq. (11.1.18). 

Degradation of an image due to random motion has been discussed by several 
authors, including Mahajan (1978) and Wetherell (1980). The starting assumption 
of this analysis is an image motion that is rotationally symmetric and described by 
the unnormalized probability function 

$$P(\alpha) = \exp(-\alpha^2/2\sigma'^2), \tag{11.1.17}$$

where $\sigma'$ is the standard deviation and $\alpha$ is the radius of the excursion of the image 
from the mean position, both in angular units. If we normalize $\sigma'$ by multiplying 
by the cutoff frequency $D/\lambda$, so that $\sigma = \sigma' D/\lambda$, then the pointing degradation 
function, as shown by Mahajan, is 

$$T_p(\nu_n) = \exp(-2\pi^2\sigma^2\nu_n^2). \tag{11.1.18}$$

Because $T_p$ decreases as $\nu_n$ increases, it is evident that the effect of this 
degradation function is to depress the MTF more at higher spatial frequencies. 
Figure 11.6 shows the pointing degradation function for several values of $\sigma$. For 
an otherwise perfect system, the product of curves in Figs. 11.2 and 11.6 gives 
the system MTF with random pointing error. 
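
The product rule can be illustrated with a few lines of code. For brevity the sketch below multiplies the pointing factor by the MTF of a clear aperture ($\varepsilon = 0$) rather than by the obscured-pupil curves of Fig. 11.2; that simplification and the function names are assumptions of this example.

```python
import math


def pointing_factor(nu_n, sigma):
    """T_p of Eq. (11.1.18); sigma is the normalized rms pointing error."""
    return math.exp(-2.0 * math.pi ** 2 * sigma ** 2 * nu_n ** 2)


def clear_mtf(nu_n):
    """Diffraction MTF of a clear circular aperture (eps = 0 assumed)."""
    if nu_n >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(nu_n)
                              - nu_n * math.sqrt(1.0 - nu_n ** 2))


def system_mtf(nu_n, sigma):
    """Diffraction MTF times the pointing factor, all other errors zero."""
    return clear_mtf(nu_n) * pointing_factor(nu_n, sigma)


for sigma in (0.1, 0.2, 0.3):
    print(sigma, [round(system_mtf(nu, sigma), 4) for nu in (0.25, 0.5, 0.75)])
```

Further independent degradation factors would enter the same way, as additional multipliers in `system_mtf`.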

Fig. 11.7. Point spread function for obstructed system ($\varepsilon = 0.33$) with rms midfrequency error $\omega$ 
and $l = 0.04$. Results are calculated from Eq. (11.1.12). 


11.1.d. EXAMPLES 


We now illustrate the results of the previous section by giving examples of 
PSFs and EEs calculated from Eqs. (11.1.12) and (11.1.13) with different 
degradation factors multiplying $T_D$, the diffraction MTF. All of the results 
given are for $\varepsilon = 0.33$, the obscuration ratio of the HST, with zero figure error. 
A more complete discussion of the expected image characteristics of HST, with 
all factors taken together, follows in the next section. 

Figures 11.7 and 11.8 show PSF and EE for a pupil wavefront with random 
error of the type described by Eq. (11.1.15), for three values of $\omega$. The 
approximate correlation length assumed for these calculations is 0.04 cycles/diameter. 
Relative to the PSF for a perfect system, given in Fig. 10.5, the effect of 
this error is to depress the disk and inner ring and raise the outer rings. The Strehl 
intensity is given by Eq. (10.3.9). The transfer of energy outward from the center 
of the Airy pattern is clearly shown in Fig. 11.8. Taking values of EE at the right-hand 
side of Fig. 11.8, we see that nearly five times as much energy is outside the 
fourth bright ring when $\omega = 0.14$, compared to a perfect image. 

Fig. 11.8. Encircled energy fraction for obstructed system ($\varepsilon = 0.33$) with rms midfrequency 
error $\omega$ and $l = 0.04$. Results are calculated from Eq. (11.1.13). 

Fig. 11.9. Point spread function for perfect system ($\varepsilon = 0.33$) with pointing error. The $\sigma$ is the 
normalized rms Gaussian error. Results are calculated from Eq. (11.1.12). 


When high-frequency microripple is present, as described by Eq. (11.1.16), 
the effect is to depress the PSF by the factor $T_h$ at all image radii. This occurs 
because $T_h$ is independent of spatial frequency, hence it can be taken out of the 
integral in Eq. (11.1.12). The effect of microripple on EE is similar, for the same 
reason. In theory, therefore, the energy scattered by microripple error disappears; 
in practice the energy is scattered at angles large compared to the Airy disk 
diameter. 

Figures 11.9 and 11.10 show PSF and EE for a perfect system with pointing 
error described by Eq. (11.1.18), for three values of $\sigma$. The effect of increased 
pointing error is clearly one of reducing the Strehl intensity, smoothing the PSF 
pattern, and distributing a given fraction of the encircled energy over a larger 
area. For the values of $\sigma$ shown, the redistribution of energy takes place largely 
between the disk and first bright ring. With specific reference to HST, the curve 
with $\sigma = 0.1$ corresponds to an rms pointing error $\sigma'$ on the sky of 0.005 arc-sec, 
at $\lambda = 580$ nm, with $\sigma' = \sigma(\lambda/D)$. Because $\sigma$ is inversely proportional to $\lambda$ for a 
given $\sigma'$, the curve with $\sigma = 0.3$ corresponds to the same pointing error at 
$\lambda = 190$ nm. As expected, a given pointing error on the sky has a greater effect on 
a “sharper” image. 
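
The conversion between the normalized pointing error and the error on the sky is simple arithmetic, sketched below for the 2.4-m aperture of HST. The function names are illustrative.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def sky_pointing_error_arcsec(sigma, wavelength_m, diameter_m):
    """sigma' = sigma * (lambda / D), converted from radians to arc-seconds."""
    return sigma * wavelength_m / diameter_m * ARCSEC_PER_RADIAN


def normalized_pointing_error(sigma_sky_arcsec, wavelength_m, diameter_m):
    """sigma = sigma' * (D / lambda), with sigma' given in arc-seconds."""
    return sigma_sky_arcsec / ARCSEC_PER_RADIAN * diameter_m / wavelength_m


D_HST = 2.4  # aperture diameter in meters, as quoted in the text

print(sky_pointing_error_arcsec(0.1, 580e-9, D_HST))
print(normalized_pointing_error(0.005, 190e-9, D_HST))
```

The first value comes out close to 0.005 arc-sec and the second close to 0.3, consistent with the correspondence between the curves described above.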

Fig. 11.10. Encircled energy fraction for perfect system ($\varepsilon = 0.33$) with pointing error. The $\sigma$ is 
the normalized rms Gaussian error. Results are calculated from Eq. (11.1.13). 


With these examples it should be clear that the MTF approach is a powerful 
technique, especially when errors other than figure errors are present. We 
conclude our introduction to the MTF approach in Section 11.4 following a 
discussion of the Hubble Space Telescope. 


11.2. HUBBLE SPACE TELESCOPE, PRELAUNCH EXPECTATIONS 


NOTE: This section on the Hubble Space Telescope is essentially unchanged 
from the version that appeared in the 1st edition of this book. What was presented 
there, and is repeated here, represented the best estimates of the expected optical 
performance of HST following its launch. As is well known, HST did not perform 
up to expectations because of a primary mirror with the wrong conic constant. As 
is also well known, HST was given an optical “fix” that corrected for the 
spherical aberration introduced by the primary and the performance was 
restored, or nearly so, to that of prelaunch expectations. With this fix the results 
given here are again valid, hence the reason for leaving this section largely 
unchanged. A discussion of the postlaunch reality of HST follows in Section 11.3. 


The HST will be the first large astronomical observatory in space with a 
resolution capability an order of magnitude better than is possible with ground- 
based telescopes in the visible and near ultraviolet. This unique facility will 
enable astronomers to make observations not possible from the ground and obtain 
data needed to answer many fundamental astronomical questions. Given this 
promise, a brief description of HST and the expected image characteristics is in 
order. 


11.2.a. BASIC CONFIGURATION 


The HST is a 2.4-m Cassegrain telescope of the Ritchey-Chretien type, with 
the nominal parameters given in Table 11.2. The performance goals set by NASA 
at the start of the project include spectral coverage from 115nm to the far 
infrared, with diffraction-limited performance at visible wavelengths. Analysis of 
the completed system shows that HST is expected to meet or exceed the stated 
goal of $\lambda/20$ rms wavefront error at $\lambda = 633$ nm on the axis of the f/24 focal 
surface. 

The complete observatory includes the following complement of instruments: 
wide-field/planetary camera (WFPC), faint object camera (FOC), faint object 
spectrograph (FOS), high-resolution spectrograph (HRS), and high speed photo- 
meter (HSP). The fine guidance system (FGS) of the telescope will also be used 
for astrometric observations. For details on these instruments and their observing 
modes, the reader should consult the references at the end of the chapter, 
especially the Instrument Handbook distributed by the Space Telescope Science 
Institute. 


11.2.b. ON-AXIS IMAGE CHARACTERISTICS


In this section we describe the expected on-axis image characteristics at the
f/24 focal surface. All of the results presented assume the mirrors are clean with
no scattering due to dust.


Table 11.2

Nominal Design Parameters of Hubble Space Telescope

Primary:    D = 2400 mm, R_1 = -11040 mm, f/2.30, K_1 = -1.0022985
Secondary:  R_2 = -1358 mm, K_2 = -1.496
Overall:    m = 10.435, β = 0.2717, k = 0.1112, f/24, obscuration ratio ε = 0.33,
            scale = 3.58 arc-sec/mm = 279 µm/arc-sec
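
The scale entry in Table 11.2 follows directly from the aperture and the system focal ratio. The short Python sketch below is only a check on that arithmetic; the function name `plate_scale` is ours and is not taken from the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # about 206265

def plate_scale(diameter_mm, focal_ratio):
    """Return (arc-sec per mm, micrometers per arc-sec) at the focal surface."""
    focal_length_mm = diameter_mm * focal_ratio
    arcsec_per_mm = ARCSEC_PER_RAD / focal_length_mm
    um_per_arcsec = 1000.0 / arcsec_per_mm
    return arcsec_per_mm, um_per_arcsec

# HST: D = 2400 mm at f/24, so the system focal length is 57600 mm
scale, inverse_scale = plate_scale(2400.0, 24.0)
print(f"{scale:.2f} arc-sec/mm, {inverse_scale:.0f} um/arc-sec")
```

With the tabulated D and focal ratio this gives values that round to the 3.58 arc-sec/mm and 279 µm/arc-sec listed in the table.
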

Analysis of HST performance proceeds along the lines described in the
previous section. Each independent component is described by an MTF degradation
function, and the product of these functions and the diffraction MTF is used
as the basis of calculations of the image characteristics. Contributors to T_f in Eq.
(11.1.14) include the aberrations of the mirrors, misalignments of the mirrors,
thermal changes in orbit, ground-to-orbit changes, and errors of the optical
system used to measure the wavefront in orbit. The errors that remain after figure
errors are removed from the wavefront map are used to calculate T_m in the form
given in Eq. (11.1.15). Surface errors derived from measurements on small parts
of the mirrors are modeled as high-frequency errors in the form given in Eq.
(11.1.16). The product of these functions, to which the figure error is the largest
contributor, gives the system degradation function in the absence of pointing
error. This combination leads to an overall rms system wavefront error of
approximately λ/21 at a wavelength of 633 nm.

The product of the system degradation function with the diffraction MTF and 
the pointing degradation factor given in Eq. (11.1.18) gives the rotationally 
symmetric system MTF used in Eqs. (11.1.12) and (11.1.13). All of the following 
results are derived from calculations using these relations, with a nominal rms 
pointing error of 0.007 arc-sec assigned to image motion. 
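
As a rough illustration of how such a product of factors behaves, the following Python sketch multiplies a diffraction MTF by a pointing factor. It rests on two simplifying assumptions that are ours rather than the text's: the diffraction MTF is that of a clear circular aperture (the central obscuration is ignored), and the pointing factor is taken to be the Gaussian form exp(-2π²σ²ν²), with σ the rms image motion per axis, which may not match Eq. (11.1.18) in detail. The figure, mid-frequency, and high-frequency factors are lumped into a placeholder argument `other_factors`.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-sec

def diffraction_mtf(nu_n):
    """MTF of a clear circular aperture; nu_n = spatial frequency / cutoff."""
    if nu_n >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(nu_n) - nu_n * math.sqrt(1.0 - nu_n ** 2))

def jitter_factor(nu, sigma_rad):
    """Assumed Gaussian image-motion factor; nu in cycles/rad, sigma is rms per axis."""
    return math.exp(-2.0 * (math.pi * sigma_rad * nu) ** 2)

def system_mtf(nu_n, wavelength_m, diameter_m, sigma_arcsec, other_factors=1.0):
    """Product of diffraction MTF, pointing factor, and any other degradation factors."""
    nu = nu_n * diameter_m / wavelength_m  # cutoff frequency is D / lambda
    return diffraction_mtf(nu_n) * jitter_factor(nu, sigma_arcsec * ARCSEC) * other_factors

for lam_nm in (633.0, 200.0):
    t = system_mtf(0.5, lam_nm * 1e-9, 2.4, 0.007)
    print(f"lambda = {lam_nm:.0f} nm: T at half the cutoff = {t:.3f}")
```

Because the cutoff frequency D/λ grows toward shorter wavelengths, a fixed pointing error removes a larger share of the high-frequency response in the ultraviolet than in the visible.
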

Figures 11.11 and 11.12 show PSFs at a number of wavelengths, with EE for 
each of these wavelengths shown in Figs. 11.13 and 11.14. It is evident from 
these curves that the PSFs show progressive degradation at shorter wavelengths. 
The ring structure in the Airy pattern, clearly seen in the visible and infrared 
wavelengths, is absent at the shortest wavelengths. This is a result both of 
pointing error and nonfigure contributors to the degradation function. We also see 
that the level of the PSF decreases in the ultraviolet, a consequence of the mid- 
and high-frequency components in the degradation function. 

The Strehl ratio S and the FWHM of the image peak are shown in Fig. 11.15.
The most notable feature of the FWHM curve is the limiting core diameter of
about 0.023 arc-sec at the shortest wavelengths. Figure 11.16 shows the peak
intensity I_0 as a function of wavelength, normalized to unity at λ = 633 nm,
assuming equal flux at each wavelength. The intensity at the peak is given by Eq.
(10.2.19) with S included for a degraded image. Therefore

$$\frac{I_0(\lambda)}{I_0(633)} = \frac{S_\lambda}{S_{633}} \left(\frac{633}{\lambda}\right)^2,$$

where λ is in nanometers and S is given in Fig. 11.15. Also shown in Fig. 11.16 is
the average intensity over an area enclosing 60% of the total energy. These results
are derived using Eq. (10.2.20) with an enclosed energy fraction of 0.6 and image
radii taken from Figs. 11.13 and 11.14. The curves in Fig. 11.16 would show a λ^-2
dependence for a perfect image; the actual curves show a peak in the ultraviolet.
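
The normalized peak intensity is simple to evaluate once S is known. In the sketch below the Strehl ratio is not read from Fig. 11.15; it is approximated by exp[-(2πω/λ)²] with ω = 633/21 nm, an assumption of ours that ignores pointing error and any wavelength dependence of the degradation factors, and that is crude once S falls well below unity. It is meant only to show how the λ^-2 factor and a falling Strehl ratio compete.

```python
import math

def marechal_strehl(rms_waves):
    """Approximate Strehl ratio for a small rms wavefront error in waves."""
    return math.exp(-(2.0 * math.pi * rms_waves) ** 2)

def relative_peak(lam_nm, strehl_lam, strehl_633):
    """Peak intensity at lam_nm relative to 633 nm, for equal flux."""
    return (strehl_lam / strehl_633) * (633.0 / lam_nm) ** 2

rms_nm = 633.0 / 21.0  # overall rms wavefront error quoted in the text
s_633 = marechal_strehl(rms_nm / 633.0)
for lam in (1000.0, 633.0, 400.0, 200.0, 122.0):
    s = marechal_strehl(rms_nm / lam)
    print(f"{lam:6.0f} nm  S = {s:.3f}  I0/I0(633) = {relative_peak(lam, s, s_633):.2f}")
```

Under this approximation the product S/λ² has its maximum at λ = 2πω, which for ω = 633/21 nm lies near 190 nm, so even this simplified model places the peak in the ultraviolet.
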

Fig. 11.11. Predicted PSFs for Hubble Space Telescope (HST). The rms wavefront error is λ/21 at
633 nm; the rms pointing error is 0.007 arc-sec.


Extension of PSF calculations to larger image radii than those shown in Figs.
11.11 and 11.12 shows that the average intensity far from the Airy disk falls off as
α^-3, as for a perfect image. However, the intensity level is higher than that of a
perfect image by an amount that depends on the wavelength. Comparing the
average PSF at a radius of 1 arc-sec calculated from Eq. (11.1.12) with that given
by Eq. (10.2.12), we get the results shown in Fig. 11.17. The increasing spread
between the curves in Fig. 11.17 at shorter wavelengths is largely a consequence
of the mid- and high-frequency factors in the degradation function. As noted in
Section 11.1, the effect of these factors is to transfer energy from the inner region
of the Airy pattern to the wings.

These characteristics for the HST images are based on extensive modeling and 
represent the best estimate of what can be expected once HST is in orbit. 
Predictions of the effect on the PSF and EE from dust on the mirrors have been 
made, with the results predicting some additional fraction of light scattered into 
the image wings. This fraction is uncertain because it is sensitive to the size 
distribution among the dust particles. Definitive image characteristics will only be 
known after extensive observations in space. 

Fig. 11.12. Predicted PSFs for HST. The wavefront and pointing errors are given in the caption of
Fig. 11.11.

Fig. 11.13. Predicted EEs for HST. The wavefront and pointing errors are given in the caption of
Fig. 11.11.

Fig. 11.14. Predicted EEs for HST. The wavefront and pointing errors are given in the caption of
Fig. 11.11.

Fig. 11.15. Predicted Strehl ratio and FWHM for HST. The wavefront and pointing errors are
given in the caption of Fig. 11.11.

Fig. 11.16. Peak intensity of HST normalized to unity at 633 nm and average intensity per unit
flux on area enclosing 60% of the encircled energy.

Fig. 11.17. Predicted average PSF for HST at 1 arc-sec from image peak. The wavefront and
pointing errors are given in the caption of Fig. 11.11.

NOTE: The material in Section 11.2.c of the 1st edition has been incorporated
into Section 10.4 of this edition. The preceding paragraph appeared in Section
11.3 of the 1st edition and is now located where it properly belongs. A discussion
related directly to the last sentence of the preceding paragraph follows.


11.3. HUBBLE SPACE TELESCOPE, POSTLAUNCH REALITY 


Contrary to the predicted diffraction-limited performance, actual images taken 
with HST shortly after launch were seriously deficient in quality, even after 
repeated attempts to improve the quality by refocusing the telescope. After careful 
analysis of the characteristics of the images obtained during the weeks following 
launch, it became clear that the image degradation was a consequence of 
significant spherical aberration introduced by the primary mirror. The error in 
the primary is essentially one of an incorrect conic constant, a consequence of 
errors in the null mirror/lens system used in the manufacture of the primary. An 
interested reader can learn more about the detailed characteristics of these 
aberrated images and the null system in a paper by Burrows (see Burrows, 1991). 

For our purposes the details of the analysis leading to the identification of
spherical aberration due to the primary as the cause of the poor images are
unimportant. Rather, we take this as given and use both the geometric and
diffraction theory of aberrations to determine the consequences of the spherical
aberration on the image.

The starting point for our analysis is the spherical aberration coefficient B_3 for
a Cassegrain telescope from Table 6.5. Using the nominal parameters of HST
from Table 11.2 we find B_3 = 0, as expected. If the conic constant K_1 is changed
to K_1′, the change in B_3 is

$$\delta B_3 = \frac{K_1' - K_1}{4R_1^3} = \frac{\delta K_1}{4R_1^3}, \qquad (11.3.1)$$

hence

$$\lambda a_{40} = \delta B_3\, a^4 = \frac{\delta K_1}{4}\left(\frac{a}{R_1}\right)^3 a = -\frac{\delta K_1}{256 F_1^3}\, a. \qquad (11.3.2)$$

The first part of Eq. (11.3.2) is taken from Table 10.8.

Analysis of the aberrated images gave K_1′ = -1.0140 ± 0.0005, hence a
change δK_1 = -0.0117. Substituting δK_1 and parameters from Table 11.2 into
Eq. (11.3.2) gives λa_40 = 4.51 µm as the magnitude of the wavefront error at the
edge of the HST primary, hence a surface error of 2.25 µm. With δB_3 > 0, the
marginal focus lies farther from the secondary mirror than the paraxial focus.
This is expected from a primary that is “flatter,” that is, with a more negative
conic constant.
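
The numbers quoted here can be checked with a few lines of Python. The sketch takes Eqs. (11.3.1) and (11.3.2) in the forms δB_3 = δK_1/(4R_1³) and λa_40 = δB_3 a⁴ = -δK_1 a/(256 F_1³); the sign of the last form assumes R_1 negative and F_1 positive, which is our reading of the conventions rather than something stated explicitly in this section.

```python
# Nominal HST values from Table 11.2 (lengths in mm)
D = 2400.0
R1 = -11040.0
K1_nominal = -1.0022985
K1_actual = -1.0140          # value inferred from the aberrated images

a = D / 2.0                  # radius of the primary
F1 = abs(R1) / (2.0 * D)     # primary focal ratio, with f1 = |R1| / 2

dK1 = K1_actual - K1_nominal
dB3 = dK1 / (4.0 * R1 ** 3)                       # Eq. (11.3.1)
wavefront_mm = dB3 * a ** 4                       # first form in Eq. (11.3.2)
wavefront_alt_mm = -dK1 * a / (256.0 * F1 ** 3)   # last form in Eq. (11.3.2)

print(f"F1 = {F1:.2f}, dK1 = {dK1:.4f}")
print(f"wavefront error at edge = {wavefront_mm * 1e3:.2f} um")
print(f"surface error at edge   = {wavefront_mm * 1e3 / 2:.2f} um")
```

Both forms of Eq. (11.3.2) give the same wavefront error, about 4.51 µm at the edge of the primary, and half of that for the surface error.
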

Given the wavefront error λa_40 = 4.51 µm and ε = 0.33, we find the rms
wavefront error and blur diameter at diffraction focus and distances from paraxial
focus as follows,

ω = 0.30 µm = 0.42 waves at λ = 633 nm,
angular diameter at diffraction focus = 4λa_40/a = 3.1 arc-sec,
paraxial focus → marginal focus = 16λa_40 F^2 = 41.6 mm,
paraxial focus → diffraction focus = 8λa_40 F^2 (1 + ε^2) = 23.1 mm,

with the necessary relations taken from Table 10.6, Table 10.8, Section 10.3.c,
and Table 10.4, in turn. The overall system is clearly far from diffraction-limited.
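
A brief Python check of the three geometric quantities follows. It assumes the relations read as 4λa_40/a for the angular blur diameter, 16λa_40F² for the paraxial-to-marginal focus separation, and 8λa_40F²(1 + ε²) for the paraxial-to-diffraction focus separation, with F = 24 the system focal ratio and a = 1200 mm the radius of the primary; the variable names are ours.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

wavefront_mm = 4.51e-3   # lambda * a40 from the text, in mm
a_mm = 1200.0            # radius of the primary
F = 24.0                 # system focal ratio
eps = 0.33               # obscuration ratio

blur_arcsec = 4.0 * wavefront_mm / a_mm * ARCSEC_PER_RAD
paraxial_to_marginal_mm = 16.0 * wavefront_mm * F ** 2
paraxial_to_diffraction_mm = 8.0 * wavefront_mm * F ** 2 * (1.0 + eps ** 2)

print(f"angular blur diameter       = {blur_arcsec:.2f} arc-sec")
print(f"paraxial to marginal focus  = {paraxial_to_marginal_mm:.1f} mm")
print(f"paraxial to diffraction focus = {paraxial_to_diffraction_mm:.1f} mm")
```

The results agree with the quoted 3.1 arc-sec, 41.6 mm, and 23.1 mm to within rounding of the inputs.
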

We first examine the images as seen in spot diagrams. Figure 11.18 shows spot 
patterns at equal separations, starting with paraxial focus at the left and ending 
with marginal focus at the right. Note that the smallest overall image is at the 
circle of least confusion, as expected for large spherical aberration, with a 
diameter of approximately 1.6 arc-sec. 

From the point of view of diffraction theory, the optimum image location is not
the circle of least confusion, but at diffraction focus where the rms wavefront
error is a minimum. Calculations of the encircled energy fraction show that,
within an image radius of 0.5 arc-sec at λ = 633 nm, EE is about 0.65 for
diffraction focus versus approximately 0.4 for the circle of least confusion.

Fig. 11.18. Spot diagrams for the aberrant on-axis image of HST following launch. Paraxial focus
(PF); diffraction focus (DF); marginal focus (MF). The scale bar on the left is 6 arc-sec long.

Although the choice of the optimum focus as diffraction focus seems evident,
further considerations led to setting the focus at that position giving the maximum
encircled energy fraction in 0.1 arc-sec radius at λ = 486 nm. This setting is
nearly 10 mm removed from diffraction focus and in the direction of paraxial
focus. Curves of EE for the chosen focus and diffraction focus are shown in Fig.
11.19 for λ = 633 nm. These curves were computed using Eq. (10.1.10) with i(P)
calculated from Eq. (10.3.4). From Fig. 11.19 we see that EE is 0.15 within
0.1 arc-sec radius at the chosen focus.

Cross sections of i(P) at λ = 633 nm for these two foci are shown in Fig.
11.20. Note the relatively sharp central peaks present in Fig. 11.20, with the peak
at the chosen focus noticeably wider than the one at diffraction focus. Note that
the first few rings surrounding the main peak at the chosen focus are also more
widely spaced than those for diffraction focus. These results are understandable
because the main contributions to the main peak for an image near paraxial focus
come from the inner part of the annular aperture. Because the width of the
contributing annulus is narrower, the diffraction peak is broader.

Conversely, the light from the outer part of the annulus, where the wavefront 
error is larger, is primarily responsible for the broad but faint plateau of light in 
which the main peak is centered. From geometric considerations, the radius of the 
spot pattern at the chosen focus is about 2.2 arc-sec, while the diffraction PSF in 
Fig. 11.20 shows a falloff in intensity starting at approximately 1.5 arc-sec. 

Fig. 11.19. Encircled energy curves for aberrant HST image at two focal positions: maximum
encircled energy at 0.1 arc-sec (solid); diffraction focus (dashed).

Fig. 11.20. The PSFs for aberrant HST image at two focal positions: maximum encircled energy
at 0.1 arc-sec (solid); diffraction focus (dashed).


Because of the dramatic difference between the expected encircled energy 
fractions in Fig. 11.13 and those in Fig. 11.19, the performance of HST and its 
suite of instruments was severely compromised. In the months following the 
launch, a set of possible optical “fixes” of the spherical aberration was studied. 
The chosen solution has been discussed briefly in Sections 2.6.c and 6.5.b, 
namely, that of the addition of two mirrors following the HST secondary. The first 
of these two mirrors reimaged the entrance pupil (primary mirror) of HST onto 
the second mirror, where the optical correction was made. In effect, the surface
error at the primary, (B_3/2)(ρa)^4 from Eq. (11.3.1) for a ray at radius ρa on the
primary, is put on the second mirror, but with opposite sign, at the corresponding
radius in the reimaged pupil. With
this correction, rays through the aberrant telescope and the two-mirror add-on
again satisfy Fermat’s Principle and spherical aberration is eliminated.

The implementation of this solution was specific to each of four instruments 
on HST. A separate pair of mirrors was configured for each of three axial 
instruments, the FOC, FOS, and HRS, and mounted in a separate module called 
COSTAR (for Corrective Optics, Space Telescope Aberration Recovery). With 
COSTAR in place in HST, each pair of mirrors was positioned ahead of the focal 
surface and redirected an incident beam in the proper direction. Thus the light 
entering the apertures of the spectrographs or camera was free of spherical 
aberration and the full capabilities of these instruments could be realized. 

In the case of WFPC, the radial instrument, implementation of the fix required
modifying the original optical design slightly to ensure that the reimaged pupil
was exactly on the secondary mirror of the instrument cameras. Because of the
potentially serious problems due to pupil shear from a decentered pupil, discussed
in Section 6.5.b, WFPC2 was limited to four cameras rather than eight as in the
original WFPC. Fortunately, the installation of COSTAR and WFPC2 in orbit was
spectacularly successful and HST has realized its original promise.


11.4. CONCLUDING REMARKS 


The theory, as presented in Chapters 10 and 11, is an introduction to the main 
features of the diffraction theory of aberrations, but for many systems this is 
sufficient for design purposes. Relations for orthogonal aberrations of higher 
order have been derived and these are used in designing optical systems of the 
highest precision. The reader should consult the literature for information on 
these refinements. 

The discussion in Section 11.1 leading up to Eq. (11.1.6) was general, but the 
derivation of Eqs. (11.1.8) through (11.1.13) required a symmetric aperture 
function and PSF. Returning to the general case, we outline the approach for an 
arbitrary aperture function. 

The starting point for an arbitrary aperture function is Eq. (10.5.1) with A(ξ, η)
including a term exp(ikΦ) for aberrations. The optical path difference Φ usually
covers figure errors of the type in Eq. (10.3.2), but may include a detailed map of
other wavefront errors, as in the case of a mirror map for HST. With the aperture
function specified, the solution of Eq. (10.5.1) gives U(P), the Fourier transform
of the aberrant aperture function, and i(P) = |U(P)|^2/|U(0)|^2.

The next step is to compute the Fourier transform of i(P) according to Eq.
(11.1.6) and find the transfer function T(ν_x, ν_y). At this point mid- or
high-frequency errors or pointing jitter can be incorporated by multiplying T by the
appropriate degradation functions to get the overall system transfer function.
The final step to find the PSF is to take the Fourier transform of this system
transfer function.
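
The sequence of steps just outlined can be sketched numerically. The Python example below is a minimal illustration under several assumptions of ours: a small 24 × 24 grid with a naive discrete Fourier transform (to avoid any dependence on external libraries), an annular aperture with a hypothetical balanced spherical aberration of 0.25 waves, a PSF normalized to unit total energy rather than to |U(0)|², and a Gaussian jitter factor standing in for the degradation functions. None of the numerical values are taken from HST data.

```python
import cmath
import math

N = 24          # grid size (small, because a naive DFT is used)
RADIUS = 5.0    # pupil radius in samples; diameter < N/2 keeps T unaliased
EPS = 0.33      # central obscuration ratio
A40 = 0.25      # hypothetical spherical aberration coefficient, in waves
JITTER = 0.8    # hypothetical rms image motion per axis, in PSF samples

def dft(x, sign):
    """Naive 1D discrete Fourier transform with exponent sign +1 or -1."""
    n = len(x)
    w = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    return [sum(x[k] * w[(j * k) % n] for k in range(n)) for j in range(n)]

def dft2(a, sign=-1):
    """Separable 2D DFT: transform rows, then columns."""
    rows = [dft(r, sign) for r in a]
    cols = [dft(list(c), sign) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def freq(k):
    """Signed index for DFT bin k (origin at bin 0, with wraparound)."""
    return k if k <= N // 2 else k - N

# 1. Aperture function with the aberration phase exp(ik*Phi) on an annulus.
aperture = []
for iy in range(N):
    row = []
    for ix in range(N):
        rho = math.hypot(freq(ix), freq(iy)) / RADIUS
        if EPS <= rho <= 1.0:
            phi = A40 * (rho ** 4 - rho ** 2)   # in waves, balanced by defocus
            row.append(cmath.exp(2j * math.pi * phi))
        else:
            row.append(0.0)
    aperture.append(row)

# 2. Amplitude U(P) and PSF i(P), normalized here to unit total energy.
U = dft2(aperture)
psf = [[abs(u) ** 2 for u in r] for r in U]
total = sum(map(sum, psf))
psf = [[v / total for v in r] for r in psf]

# 3. Transfer function T = Fourier transform of the PSF, so T(0, 0) = 1.
T = dft2(psf)

# 4. Multiply by a degradation function (assumed Gaussian jitter factor).
def jitter(kx, ky):
    nu2 = (freq(kx) ** 2 + freq(ky) ** 2) / N ** 2   # (cycles per sample)^2
    return math.exp(-2.0 * (math.pi * JITTER) ** 2 * nu2)

T_sys = [[T[ky][kx] * jitter(kx, ky) for kx in range(N)] for ky in range(N)]

# 5. The inverse transform of the system transfer function is the degraded PSF.
psf_sys = [[v.real / N ** 2 for v in r] for r in dft2(T_sys, sign=+1)]

print(f"central intensity before: {psf[0][0]:.4f}, after: {psf_sys[0][0]:.4f}")
```

Because the degradation factor equals unity at zero frequency, the total energy in the PSF is unchanged; the factor only redistributes light from the core into the wings. For grids of practical size a fast Fourier transform routine would replace the naive transform used here.
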

Imaging characteristics of diffraction-limited telescopes at visible wavelengths 
were of little more than academic interest prior to the start of the space age, with 
images from large ground-based telescopes dominated by seeing. Fortunately the 
diffraction theory of aberrations discussed in Chapter 10 was well established at 
the time when thoughts turned to designing a space observatory as large as HST. 

Diffraction theory, along with that of transfer functions, is now essential in the 
design of large ground-based telescopes. In Chapter 16 we use these theories to 
examine the effects of atmospheric turbulence on image quality and the potential 
gains from applying techniques of adaptive optics. In Chapter 18 we use transfer 
functions to represent some of the characteristics of large mirrors and their effects 
on image quality. 




REFERENCES 


Mahajan, V. (1978). Appl. Opt., 17: 3329. 

O’Neill, E. (1963). Introduction to Statistical Optics, Reading, MA: Addison-Wesley. 

Smith, F. (1963). Appl. Opt., 2: 335. 

Wetherell, W. (1980). Applied Optics and Optical Engineering, vol. 8, Chap. 6, New York: Academic 
Press. 


BIBLIOGRAPHY 


Fourier Optics 

Goodman, J. (1968). Introduction to Fourier Optics, New York: McGraw-Hill. 
Hecht, E. (1987). Optics, second edition, Chap. 11, Reading, MA: Addison-Wesley. 
Steward, E. (1983). Fourier Optics: An Introduction, New York: John Wiley. 


Hubble Space Telescope 

Burrows, C. (1991). The First Year of HST Observations, Kinney, A. and Blades, J. (eds.), p. 96, 
Baltimore, MD: Space Telescope Science Institute. 

Instrument Handbooks. Published periodically by the Space Telescope Science Institute, Baltimore, 
MD. 

The Space Telescope Observatory (1982). NASA CP-2244, NASA, Washington, DC. 

Schroeder, D. (1985). Advances in Space Research: Astronomy from Space, 5: (3), 157. 


Chapter 12 Spectrometry: Definitions and Basic Principles


Spectral analysis of celestial objects is probably the most important means for 
learning about the physics of these sources, with a large fraction of telescope time 
used to get spectral data. In this chapter we begin to consider the characteristics of 
spectrometers used with telescopes to get this data. 

We use the term spectrometer in our discussion to refer to any of several types 
of spectroscopic instruments. A spectrograph is an instrument in which many 
spectral elements are recorded simultaneously with an area detector having many 
resolution elements. A monochromator is an instrument in which single spectral 
elements are recorded sequentially in time by a detector with a single resolution 
element. Many of the results that follow apply to either type of instrument, but if 
there is a difference the distinction is noted. 

A simple method of getting spectral information is imaging with filters, broad- 
or narrowband filters placed in the beam ahead of the telescope focal surface and 
a detector. This technique is usually called filter photometry. For point sources the 
result is one piece of spectral information per image for each source in the field, 
while for extended objects there is one piece of spectral information for each 
resolution element on a 2D detector. 

More detailed spectral information is obtained if the light is sent through a
dispersing element, such as a prism or diffraction grating. In this case a spectrum
is obtained for each source whose light passes through the disperser, with the
number of pieces of spectral information per source determined by the mode in
which the disperser is used. In the so-called slitless spectroscopy mode a prism or
grating acts as a dispersing filter and gives a spectrum for each source in the field.
For slit spectroscopy, when the disperser is part of a slit spectrometer, a spectrum
is obtained for each source whose light passes through the slit.

Two other spectroscopic modes used extensively in astronomy are multiple 
object spectroscopy (MOS) and integral field spectroscopy (IFS). When used with 
a grating spectrometer, each of these modes uses optical fibers to transfer light 
from a 2D focal surface of a telescope to the one-dimensional (1D) spectrometer 
slit. The difference between the modes is the arrangement of the fibers on the 2D 
focal surface. For MOS each fiber is set on a single source within a group of stars 
or galaxies, while for IFS the fibers are tightly packed in order to get spectra over 
an area of an extended source. 

In succeeding sections we present the basic principles that govern the 
operation of all spectrometric devices and define such terms as limit of resolution, 
spectral resolving power, etendue, and luminosity. We also discuss in more detail 
the characteristics of the various modes noted here. Another source of informa- 
tion about the principles of spectrometry is the excellent book by Meaburn 
(1976). 


12.1. INTRODUCTION AND DEFINITIONS 


Each type of spectrometer is denoted by the kind of dispersing element that is 
used, hence prism, grating, or Fabry-Perot spectrometer. The dispersing element
is usually located between auxiliary optics that collimate the light beam from the 
telescope and focus the dispersed light onto a detector. Exceptions to this type of 
arrangement are noted in the following sections, with representative examples 
discussed in Chapter 15. 

The one type of spectrometer that does not have a dispersing element is the so- 
called Fourier transform spectrometer. This instrument is basically a Michelson 
interferometer whose output is an interferogram from which spectral information 
is derived by Fourier analysis. Because the Fourier spectrometer is not a 
dispersive device, the definitions in the following sections that include dispersion 
do not apply to this type of spectrometer. An introduction to Fourier spectro- 
meters is provided in Section 13.6. 


12.1.a. ANGULAR AND LINEAR DISPERSION


Each type of dispersing element is characterized by its angular dispersion,
defined as dβ/dλ, where dβ is the angular difference between two rays of
wavelength difference dλ emerging from the disperser. This is shown schematically in
Fig. 12.1 for a single ray incident on the dispersing element. Relations for angular
dispersion of different dispersing elements are given in Chapter 13.

Fig. 12.1. Schematic of dispersive element. Angular dispersion A = dβ/dλ.

The angular dispersion is clearly a parameter associated with the dispersing
element, independent of the configuration in which it is used. When the element
is part of an optical system, the characteristics of both are combined to define the
linear dispersion, dl/dλ, where dl is the linear separation on a focal surface
between two rays of wavelength difference dλ.

If collimated light is incident on the disperser, then the linear dispersion is
given by

$$\frac{dl}{d\lambda} = f\,\frac{d\beta}{d\lambda} = fA, \qquad (12.1.1)$$

where f is the focal length of the optics following the dispersing element, and A is
the angular dispersion. This case is shown in Fig. 12.2.

If a convergent beam of light is incident on the disperser, the linear dispersion
is

$$\frac{dl}{d\lambda} = s\,\frac{d\beta}{d\lambda} = sA, \qquad (12.1.2)$$

where s is the distance from the disperser to the focal surface. This case is shown
in Fig. 12.3.


Fig. 12.2. Spectrum in focus on focal surface FS. Linear dispersion = f dβ/dλ.

Fig. 12.3. Spectrum in focus on focal surface FS with convergent light incident on disperser.
Linear dispersion = s dβ/dλ.


12.1.b. SPECTROMETER MODES; PLATE FACTOR 


Spectrometer configurations to which Eq. (12.1.1) applies include both the slit 
spectrometer and a slitless mode where a prism or grating is placed in front of a 
telescope. In the case of a slit spectrometer a separate collimator provides a 
collimated beam to the dispersing element and f is the focal length of the camera 
optics. Fabry-Perot spectrometers also have separate collimator and camera
optics. The type of slitless mode noted here is often called the objective mode; 
in this case f is the focal length of the telescope. 

Configurations to which Eq. (12.1.2) applies include the slitless mode where a 
disperser, usually a grating or grism, is placed in a converging telescope beam 
ahead of the focal surface, either prime or Cassegrain. The grism is a combination 
of a grating and prism, with the grating as the main dispersing element. This type 
of slitless mode has been called the nonobjective mode by Hoag (see Hoag and 
Schroeder (1970)). Equation (12.1.2) also applies to the so-called Monk-Gillieson 
spectrometer, in which a mirror preceding the grating is both collimator and 
camera. 

For any of the spectrometer modes noted here it is convenient to define P, the
reciprocal linear dispersion or plate factor, where

$$P = (fA)^{-1}, \qquad (12.1.3a)$$

$$P = (sA)^{-1}, \qquad (12.1.3b)$$

for the modes in Figs. 12.2 and 12.3, respectively. The units of P are usually
given as Angstroms per millimeter or nanometer per millimeter.
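
As a numerical illustration of Eqs. (12.1.1) through (12.1.3), the following Python sketch converts an angular dispersion and a focal length (or disperser-to-focus distance) into a linear dispersion and plate factor. The values of A and of the camera focal length are hypothetical, chosen only to give round numbers, and the function names are ours.

```python
def linear_dispersion(distance_mm, angular_dispersion_rad_per_nm):
    """dl/dlambda in mm/nm; distance is f (collimated beam) or s (convergent beam)."""
    return distance_mm * angular_dispersion_rad_per_nm

def plate_factor(distance_mm, angular_dispersion_rad_per_nm):
    """Reciprocal linear dispersion P in nm/mm, Eqs. (12.1.3a,b)."""
    return 1.0 / linear_dispersion(distance_mm, angular_dispersion_rad_per_nm)

A = 6.0e-4        # hypothetical angular dispersion, rad/nm
f_camera = 500.0  # hypothetical camera focal length, mm

P = plate_factor(f_camera, A)
print(f"dl/dlambda = {linear_dispersion(f_camera, A):.2f} mm/nm")
print(f"P = {P:.2f} nm/mm = {10.0 * P:.1f} Angstrom/mm")
```

The same two functions serve both modes; only the meaning of the distance argument changes between Eq. (12.1.3a) and Eq. (12.1.3b).
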

12.2. SLIT SPECTROMETERS


A general layout of a slit spectrometer in the most commonly used arrangement
is shown in Fig. 12.4. Elements of the spectrometer include an entrance slit
of width w and height h at the telescope focus, collimator and camera optics to
reimage the entrance slit, and a disperser whose angular dispersion is A.
Collimator and camera optics have focal lengths f_1 and f_2, respectively, with a
reimaged slit of width w′ and height h′ at the camera focus.

The entrance slit subtends angles φ and φ′ on the sky and δα and δα′ at the
collimator, where φ = w/f, φ′ = h/f, δα = w/f_1, and δα′ = h/f_1. The collimated
beam incident on the disperser has diameter d_1, with the direction of dispersion
parallel to the slit width, or in the plane of the diagram in Fig. 12.4.

The size of the projected slit image depends upon f_1, f_2, and the characteristics
of the disperser. Figure 12.5 shows the collimator and camera represented by
equivalent thin lenses, with an object of length l subtending an angle γ at the
collimator and its image of length l′ subtending an angle γ′ at the camera. For a
system with no dispersing element between the lenses, γ′ = γ and l′ = l(f_2/f_1).
Because a system without a disperser is rotationally symmetric about the z-axis,
these relations are true for any orientation of the object.

If a dispersing element is placed between the lenses, rotational symmetry
about the z-axis is lost and the equality of the subtended angles is not necessarily
preserved for different orientations of the object. In the direction perpendicular to
the dispersion the beam passing through the disperser is unchanged, and γ′ = γ
holds as before. This is not the case in the direction along the dispersion, where it
is necessary to take γ′ = rγ to account for possible magnification effects due to
the disperser. In terms of the subtended angles in Fig. 12.4 we have r = dβ/dα.

The parameter r, called the anamorphic magnification, depends on the type
and orientation of the dispersing element. At this point we note, without
derivation, that r = d_1/d_2, the ratio of the beam widths at the collimator and
camera. This relation is derived in what follows and the form of r in terms of the
parameters of a specific disperser is given in Chapter 13.

Applying these results to the slit dimensions in Fig. 12.4 gives

$$w' = r w \left(\frac{f_2}{f_1}\right) = r \phi D F_2, \qquad (12.2.1a)$$

$$h' = h \left(\frac{f_2}{f_1}\right) = \phi' D F_2, \qquad (12.2.1b)$$

where F_2 = f_2/d_1. This definition of the camera focal ratio in terms of the
collimator beam diameter is made to ensure that F_2 can be used in a meaningful
way when discussing irradiance of detector pixels in Section 12.2. Note that the
relations in Eq. (12.2.1a,b) also follow directly by substitution into Eq. (2.2.10)
with n′ = n.

Fig. 12.4. Schematic layout of slit spectrometer with dispersing element of angular dispersion A. See text, Section 12.2, for definitions of parameters.

Fig. 12.5. Transverse magnification |m| = l′/l = f_2/f_1 in direction perpendicular to dispersion. In
the direction parallel to dispersion |m| = r(f_2/f_1), where r is the anamorphic factor.


The relation in Eq. (12.2.1a) is important in establishing the proper value of F_2
for a detector whose pixel size Δ is correctly matched to w′. If we take a correct
match to be one in which two pixels cover the width w′, then 2Δ = rφDF_2. If, for
example, we choose Δ = 20 µm, φ = 1 arc-sec, and D = 4 m, then from Eq.
(12.2.1a) we find rF_2 = 2.
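
The matching condition is easily evaluated for other pixel sizes and telescopes. The Python sketch below implements Eq. (12.2.1a) in the form w′ = rφDF_2 and solves the two-pixel condition for the product rF_2; the function names and the `pixels_per_slit` parameter are ours.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-sec

def projected_slit_width(r, phi_arcsec, D_m, F2):
    """w' = r * phi * D * F2 from Eq. (12.2.1a), in meters."""
    return r * phi_arcsec * ARCSEC * D_m * F2

def required_r_times_F2(pixel_m, phi_arcsec, D_m, pixels_per_slit=2.0):
    """Value of r * F2 that puts pixels_per_slit pixels across w'."""
    return pixels_per_slit * pixel_m / (phi_arcsec * ARCSEC * D_m)

# Example from the text: 20-um pixels, 1 arc-sec slit, 4-m telescope
rF2 = required_r_times_F2(20e-6, 1.0, 4.0)
print(f"r * F2 = {rF2:.2f}")
```

For the example values the product comes out close to 2, and because it scales inversely with D, the same slit angle and pixel size on an 8-m telescope would require half that value.
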

This match between pixel size and projected slit width is based on the Nyquist 
criterion for discrete sampling, discussed in detail in Chapter 16. For our 
purposes here we simply state that a minimum of two samples per resolution 
element are required for unambiguous resolution of images that are just resolved 
according to the Rayleigh criterion. We adopt a similar definition for limit of 
resolution and spectral resolving power in the following sections. 

From Eq. (12.2.1a) we also see that a constant w′ for a given angular slit width
implies rDF_2 = constant. Thus a spectrometer on a larger telescope requires a
camera with a smaller focal ratio, if the ratio w′/φ is to remain constant.


12.2.a. LIMIT OF RESOLUTION AND SPECTRAL PURITY


Consider a spectrometer entrance slit of width w illuminated by light of two
monochromatic wavelengths λ and λ + Δλ. The slit image in each wavelength has
width w′ and, from Eq. (12.1.1), the separation between the centers of the images
is Δl = f_2 A Δλ. We define the limit of resolution δλ as the wavelength difference
for which Δl = w′, hence the spectral images are on the verge of being resolved
with a detector satisfying the Nyquist criterion. The analog to limit of resolution
for a monochromator with exit slit of width w′ observing a continuous light
source is the spectral purity.

Putting this condition on Δl into Eq. (12.2.1a), and using Eqs. (12.1.1) and
(12.1.3a), gives

$$\delta\lambda = \left(\frac{d\lambda}{dl}\right) w' = P w' = \frac{r \phi D}{A d_1}, \qquad (12.2.2)$$

where from Fig. 12.4 we find f_1/d_1 = f/D.
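
The algebra leading to Eq. (12.2.2) is short enough to write out in full. The steps below use only relations already given, namely the resolution condition Δl = w′, the plate factor P = 1/(f_2 A), the projected slit width w′ = rφDF_2, and the definition F_2 = f_2/d_1.

```latex
\begin{align*}
\delta\lambda &= P\,w' = \frac{w'}{f_2 A}
  && \text{from } \Delta l = f_2 A\,\delta\lambda = w' \\
 &= \frac{r\,\phi\,D\,F_2}{f_2 A}
  && \text{using } w' = r \phi D F_2 \\
 &= \frac{r\,\phi\,D}{A\,d_1}
  && \text{using } F_2 = f_2/d_1
\end{align*}
```

The camera focal length cancels, which is why the limit of resolution depends on the collimator beam diameter and angular dispersion but not on the camera.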
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For a given telescope diameter and angle on the sky, it is evident from Eq. 
(12.2.2) that the key factors that determine the limit of resolution are the angular 
dispersion and collimator beam diameter. We also see that putting a given 
spectrometer on a larger telescope gives a larger 6/ for the same angle on the 
sky. In order to maintain the same limit of resolution or spectral purity with a 
given type of spectrometer on a larger telescope it is necessary to keep @D/d 
constant, hence a larger spectrometer for the same angle on the sky. 

For a Monk-Gillieson slit spectrometer, usually configured as a monochromator
and shown schematically in Fig. 12.6, the relations in Eqs. (12.2.1) apply if
f₁ and f₂ are replaced by s₁ and s₂. Combining Eqs. (12.1.3b) and (12.2.1a), we
find the spectral purity is given by Eq. (12.2.2), provided d₁/r is replaced by d₂,
where d₂ is the beam size at the disperser. Hence for this spectrometer the spectral
purity is set by the angular dispersion and beam size at the disperser, for a given φ
and D.

It is important to note here that the definition of limit of resolution is a 
geometric one that does not take into account the limit on image size set by 
diffraction. The limit of resolution cannot be smaller than δλ₀, the limit set by
diffraction, where this limit depends on the angular dispersion, collimator beam
diameter, and wavelength. The expression for δλ₀ is derived in a later section.


12.2.b. SPECTRAL RESOLVING POWER


The spectral resolving power ℛ, a dimensionless measure of the limit of
resolution, is defined as ℛ = λ/δλ, hence

$$\mathcal{R} = \frac{\lambda}{\delta\lambda} = \frac{\lambda A d_1}{r\phi D}.$$    (12.2.3)

Fig. 12.6. Schematic of Monk-Gillieson spectrometer in direction parallel to dispersion, with
dispersing element in convergent light.

It is clear from Eq. (12.2.3) that a larger telescope requires a larger beam diameter
for a given type of disperser, if the resolving power is to be kept constant.
Because δλ = δλ₀ at the limit set by diffraction, there is also a largest possible
spectral resolving power ℛ₀, although in most applications in astronomy the
resolving power given by Eq. (12.2.3) is considerably smaller than ℛ₀.
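
As a rough numerical illustration of Eqs. (12.2.2) and (12.2.3), the following Python sketch evaluates the resolving power for one spectrometer on two telescopes. All numerical values (the angular dispersion, slit width, and beam diameter) are assumptions chosen for illustration and are not taken from the text.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-second

def limit_of_resolution(r, phi, D, A, d1):
    """Eq. (12.2.2): delta_lambda = r*phi*D / (A*d1)."""
    return r * phi * D / (A * d1)

def resolving_power(lam, r, phi, D, A, d1):
    """Eq. (12.2.3): R = lambda / delta_lambda."""
    return lam / limit_of_resolution(r, phi, D, A, d1)

# Assumed illustrative values, not taken from the text.
lam = 500.0          # wavelength, nm
A = 1.2e-3           # angular dispersion, rad/nm
phi = 1.0 * ARCSEC   # slit width on the sky, rad
r = 1.0              # anamorphic magnification

# Lengths in mm: same spectrometer on a 4-m and an 8-m telescope,
# then the 8-m case with the beam diameter doubled.
R_4m = resolving_power(lam, r, phi, D=4000.0, A=A, d1=100.0)
R_8m = resolving_power(lam, r, phi, D=8000.0, A=A, d1=100.0)
R_8m_scaled = resolving_power(lam, r, phi, D=8000.0, A=A, d1=200.0)

print(f"R (4 m) = {R_4m:.0f}, R (8 m) = {R_8m:.0f}, "
      f"R (8 m, doubled beam) = {R_8m_scaled:.0f}")
```

Under these assumptions the resolving power halves when the telescope diameter doubles, and is restored only when the beam diameter is doubled as well.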


12.2.c. FLUX, LUMINOSITY, AND ETENDUE 


We now seek a relation for the energy flux transmitted by the telescope- 
spectrometer combination, with this relation expressed in terms of the system 
parameters. This analysis leads to a quantity called the etendue, a quantity whose 
importance in the analysis of spectrometric instruments was first emphasized by 
Jacquinot (1954). A second and related quantity introduced is the luminosity, 
defined as the product of the etendue and net transmittance of the optics. 

The derivation of etendue given here makes use of basic photometric 
definitions given by Born and Wolf (1980). Consider a small, uniformly radiating 
surface element of area dS and photometric brightness B, as shown in Fig. 12.7. 
The flux dℱ radiated into a small cone of solid angle dΩ in a direction θ from the
normal to dS is given by

$$d\mathcal{F} = I\,d\Omega = B\cos\theta\,dS\,d\Omega,$$    (12.2.4)

where I is the intensity or flux per unit solid angle in the direction θ. For a
spectrometer we take dS as the area of the entrance slit.

The flux passing through the entrance pupil of the spectrometer, taken at the 
collimator, is the integral of Eq. (12.2.4) over the aperture of the pupil. For the 
annular cone shown in Fig. 12.8, the solid angle between θ and θ + dθ is given by
dΩ = 2π sin θ dθ, and therefore

$$\mathcal{F} = 2\pi B\,dS \int_0^{\theta_m} \cos\theta\,\sin\theta\,d\theta = BU,$$    (12.2.5)

$$U = \pi \sin^2\theta_m\,dS,$$    (12.2.6)

where tan θₘ = d₁/2f₁, U is the etendue, and dS = wh.


Fig. 12.7. Surface element of area dS and brightness B radiating into solid angle dΩ. See Eq.
(12.2.4).

Fig. 12.8. Flux entering spectrometer entrance pupil. See Eq. (12.2.5).


Assuming θₘ is small we can replace sin θₘ by θₘ. Substituting for w, h, and
θₘ in terms of the telescope and spectrometer parameters gives

$$U = \frac{\pi}{4}\left(\frac{d_1}{f_1}\right)^2 wh = \frac{\pi D^2}{4}\,\phi\phi' = S\Omega.$$    (12.2.7)

In this form we see that U is the product of the collimator (telescope) area S and
the solid angle Ω subtended by the slit at the collimator (or by the slit on the sky).
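
The equality of the two forms in Eq. (12.2.7) can be checked numerically. The sketch below is a minimal illustration; the telescope, collimator, and slit values are assumptions, and the function names are hypothetical.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-second

def etendue_at_slit(d1, f1, w, h):
    """Small-angle form of Eq. (12.2.6): U = (pi/4) * (d1/f1)**2 * w * h."""
    return math.pi / 4.0 * (d1 / f1) ** 2 * w * h

def etendue_on_sky(D, phi, phi_h):
    """Eq. (12.2.7): U = (pi * D**2 / 4) * phi * phi' = S * Omega."""
    return math.pi * D ** 2 / 4.0 * phi * phi_h

# Assumed values (lengths in mm, angles in radians).
D, f = 4000.0, 32000.0                     # telescope diameter and focal length
d1 = 100.0                                 # collimator beam diameter
f1 = d1 * f / D                            # matched collimator: f1/d1 = f/D
phi, phi_h = 1.0 * ARCSEC, 5.0 * ARCSEC    # slit width and height on the sky
w, h = phi * f, phi_h * f                  # linear slit width and height

U_slit = etendue_at_slit(d1, f1, w, h)
U_sky = etendue_on_sky(D, phi, phi_h)
print(f"U at slit = {U_slit:.4e} mm^2 sr, U on sky = {U_sky:.4e} mm^2 sr")
```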

It is also instructive to write U = πwθₘ · hθₘ. When written in this form, we
see from the discussion following Eq. (2.2.10) that U/π is the product of two
Lagrange invariants with n = 1. Because the Lagrange invariant is unchanged
through an optical system, the etendue is also a constant of the system. Assuming
n = 1 for the space surrounding the image, we can write U for the image as

$$U = \frac{\pi}{4}\,d_1 d_2\,\delta\beta\,\delta\beta',$$    (12.2.8)

where δβ and δβ' are the angles subtended by the projected slit at the camera, as
shown in Fig. 12.4. Given δβ = r δα and δβ' = δα', where δα = w/f₁ and
δα' = h/f₁ are the corresponding angles at the collimator, equating Eqs. (12.2.7)
and (12.2.8) gives the anamorphic magnification r = d₁/d₂, as stated earlier.

Taking τ as the transmittance of the telescope-spectrometer combination, the
luminosity ℒ of the system is given by

$$\mathcal{L} = \tau U = \tau S\Omega.$$    (12.2.9)


If ℱᵢ is the monochromatic flux incident on the telescope, the flux ℱ in the
projected image of the slit is

$$\mathcal{F} = \tau\mathcal{F}_i = \tau BU = B'U.$$    (12.2.10)

From Eq. (12.2.10) we see that the brightness B' of the image is less than the
source brightness B by the factor τ. It is worth emphasizing that Eq. (12.2.10) is
restricted to the case where n = 1 for the image space. If the image space index is
n, as it is in solid or semisolid cameras, then B' = τn²B. As shown in Section 7.5,
the focal length of solid cameras is n times smaller than that of the equivalent air
camera, and therefore δβ and δβ' in Eq. (12.2.8) are each n times smaller. Thus
the image area is n² times smaller, and the relation between ℱ and ℱᵢ in Eq. (12.2.10)
is unchanged, as must be true for conservation of energy.


12.2.d. LUMINOSITY-RESOLUTION PRODUCT 


The importance of the concepts of etendue and luminosity in evaluating 
spectrometer performance is particularly evident in the product of either with the 
spectral resolving power. Taking the product of Eqs. (12.2.3) and (12.2.7) gives 
the luminosity-resolution product as 


$$\mathcal{L}\mathcal{R} = \frac{\tau\pi}{4}\,(D\phi')(\lambda A d_2),$$    (12.2.11)


where d₂ has replaced d₁/r and the factors in the right-hand parentheses are
specific to the spectrometer. For stellar sources φ' is the diameter of the seeing
disk, whereas for extended sources φ' is the angular height of the entrance slit.

For constant φ' the product ℒℛ is a constant for a given telescope-spectrometer
combination. Increasing this product for a given telescope requires either a
larger spectrometer beam diameter or a disperser with higher angular dispersion.
Note that the width of the entrance slit does not appear in Eq. (12.2.11), hence
higher resolving power implies lower luminosity, and conversely, as the slit width
is changed. Meaburn has evaluated the ℒℛ product for a variety of spectrometers,
including prism, grating, and Fabry-Perot instruments. We evaluate this
product for a selected set of instruments in Chapter 13.

If Eq. (12.2.11) is multiplied by B, the source brightness, the result is a
flux-resolution product ℱℛ. This product is more useful than Eq. (12.2.11) when φ'
is not a constant. The relation for B at the entrance slit depends on the source. We
assume

$$B\ (\text{extended source}) = \text{constant},$$    (12.2.12a)

$$B\ (\text{stellar source}) = C'/\phi'^2,$$    (12.2.12b)

where C' is a constant. For simplicity we assume that the stellar image is square
and uniformly illuminated.

For extended sources the ℒℛ and ℱℛ products increase as φ' increases, but
the brightness B' of the image is unchanged. If the image covers many detector
elements, the exposure time to a given signal level is also the same.

For stellar sources we get 


$$\mathcal{F}\mathcal{R} = \tau C\,(D/\phi')(\lambda A d_2),$$    (12.2.13)

where C = πC'/4. When the star image overfills the entrance slit, φ < φ' and
better seeing (smaller φ') means a larger ℱℛ product. When the star image is
entirely within the slit, ℱ is constant and ℛ is inversely proportional to φ'.

A reevaluation and extension of the luminosity-resolution product has been
done by Vaughnn (1994). He points out that a single ℒℛ number may not
accurately describe a given system, and that field-dependent factors should be
taken into account. The interested reader should consult his paper.
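
The statement that the slit width drops out of the luminosity-resolution product can be illustrated numerically. The following sketch evaluates ℒ from Eqs. (12.2.7) and (12.2.9), ℛ from Eq. (12.2.3) with d₂ = d₁/r, and compares their product with Eq. (12.2.11). The transmittance, dispersion, and geometry used are assumed values, not ones given in the text.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-second

def luminosity(tau, D, phi, phi_h):
    """Eqs. (12.2.7) and (12.2.9): L = tau * (pi/4) * D**2 * phi * phi'."""
    return tau * math.pi / 4.0 * D ** 2 * phi * phi_h

def resolving_power(lam, A, d2, phi, D):
    """Eq. (12.2.3) with d2 = d1/r: R = lam * A * d2 / (phi * D)."""
    return lam * A * d2 / (phi * D)

def lr_product(tau, D, phi_h, lam, A, d2):
    """Eq. (12.2.11): LR = (tau*pi/4) * (D*phi') * (lam*A*d2)."""
    return tau * math.pi / 4.0 * (D * phi_h) * (lam * A * d2)

# Assumed values: lengths in mm, wavelength in nm, A in rad/nm.
tau, D, d2 = 0.2, 4000.0, 100.0
lam, A = 500.0, 1.2e-3
phi_h = 5.0 * ARCSEC   # slit height on the sky, held fixed

products = []
for phi_arcsec in (0.5, 1.0, 2.0):       # slit width on the sky
    phi = phi_arcsec * ARCSEC
    L = luminosity(tau, D, phi, phi_h)
    R = resolving_power(lam, A, d2, phi, D)
    products.append(L * R)
    print(f"slit {phi_arcsec} arcsec: L = {L:.3e}, R = {R:.0f}, LR = {L * R:.3e}")

expected = lr_product(tau, D, phi_h, lam, A, d2)
```

Widening the slit raises ℒ and lowers ℛ by the same factor, so the product printed in the last column does not change.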


12.2.e. SPECTROMETER SPEED AND PIXEL IRRADIANCE 


The exposure time required to record a spectrum depends on the rate at which 
energy in a given spectral band is collected in a given area on the detector. For a 
spectrometer the irradiance E of an image is defined as the spectral flux received 
at the detector per unit area. Taking the flux ℱ as the flux per unit wavelength
interval, the spectral flux in width w' is ℱ δλ with δλ = w'P from Eq. (12.2.2).
Therefore the irradiance E is given by


$$E = \frac{\mathcal{F}\,\delta\lambda}{w'h'} = \frac{\mathcal{F}P}{h'},$$    (12.2.14)


where P is the plate factor. This relation for E is identical to one for a quantity 
Bowen (1952) called speed. It is obvious that greater speed or irradiance means 
shorter exposure times. 
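Taking B from Eqs. (12.2.12), we find ℱ from Eqs. (12.2.7) and (12.2.10).
The results are

$$\mathcal{F}\ (\text{extended}) = C\tau D^2\,\phi\phi',$$    (12.2.15)

$$\mathcal{F}\ (\text{stellar}) = C\tau D^2\,(\phi/\phi'),$$    (12.2.16)

where C = πC'/4 and φ = φ' if the stellar source is entirely within the slit.
Substituting ℱ into Eq. (12.2.14) and using Eqs. (12.2.1) and (12.2.2) gives

$$E_e = \frac{C\tau\,\delta\lambda}{rF_2^2} = \frac{C\tau\,w'P}{rF_2^2},$$    (12.2.17)

$$E_s = \frac{C\tau\,\delta\lambda}{r(F_2\phi')^2} = \frac{C\tau DP\,\phi}{F_2\,\phi'^2} \quad \text{(slit-limited)},$$    (12.2.18)

$$E_s = \frac{C\tau\,\delta\lambda}{r(F_2\phi')^2} = \frac{C\tau D^2 rP}{w'} \quad \text{(seeing-limited)},$$    (12.2.19)

where Eₑ and Eₛ denote extended source and star, respectively.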

It is also important to give relations for pixel irradiance, defined as the spectral
flux per pixel. Given the definition of irradiance in Eq. (12.2.14), the pixel
irradiance is simply EΔ², where Δ² is the area of a single square pixel. As noted
following Eqs. (12.2.1), the proper match between pixel size and projected slit
width is 2Δ = w'. If more than two pixels span w' the spectral image is
oversampled, the pixel irradiance is smaller than in the properly matched case,
and the limit of resolution is set by w'. If fewer than two pixels span the projected
slit, the image is undersampled and the limit of resolution is set by the detector.

For stellar spectra in the slit-limited case, it is evident from Eq. (12.2.18) that 
better seeing means greater speed, in part because more light passes through the 
slit and in part because the image height h’ is shorter. In the seeing-limited case 
all of the light passes through the slit and, given w' ∝ φ', speed increases only in
inverse proportion to improved seeing. For extended sources it is evident from 
Eq. (12.2.17) that seeing has no effect on speed. 

It is important to note the dependence of speed on telescope diameter and 
camera focal ratio. For extended sources we see that speed is independent of 
diameter, and greater speed requires a faster camera. In the seeing-limited case for 
stellar sources, the speed is proportional to the telescope area. For stellar sources 
in the slit-limited case, the most usual situation with spectrometers on large 
telescopes, the speed is proportional to diameter and inversely proportional to the 
camera focal ratio. 

It was pointed out by Bowen that scaling the size of a spectrometer in direct 
proportion to the telescope diameter does not change the speed, at the same limit 
of resolution. This is easily shown by noting that ℱ/w'h' is independent of D.
Hence an increase in speed can only be achieved by using a spectrometer camera 
with a smaller focal ratio. This is one of the major reasons why much effort has 
gone into the design and construction of fast cameras. 

In our preceding discussion we have assumed a stellar image with uniform 
brightness, rather than a more realistic one with a bright center and fainter 
surrounding halo. The results found using a realistic profile are essentially the 
same as those of the foregoing, and little is gained by introducing this refinement. 
For further discussion of spectrometer speed, the reader should consult the 
reference by Bowen (1952). 
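
The scaling of speed in the slit-limited stellar case can be sketched numerically. The code below builds the irradiance from Eqs. (12.2.1), (12.2.14), and (12.2.16) and compares it with the closed form Eₛ = CτDPφ/(F₂φ'²). The function names and every numerical value are assumptions for illustration; the constant C is set to one in arbitrary units.

```python
def slit_image(r, phi, phi_h, D, F2):
    """Eqs. (12.2.1): projected slit width w' and height h'."""
    return r * phi * D * F2, phi_h * D * F2

def star_irradiance(C, tau, D, phi, phi_h, r, F2, P):
    """Slit-limited star: phi is the slit width, phi_h the seeing disk (phi <= phi_h)."""
    w_img, h_img = slit_image(r, phi, phi_h, D, F2)
    flux = C * tau * D ** 2 * phi / phi_h      # Eq. (12.2.16)
    dlam = w_img * P                           # limit of resolution, Eq. (12.2.2)
    return flux * dlam / (w_img * h_img)       # Eq. (12.2.14)

def star_irradiance_closed(C, tau, D, phi, phi_h, F2, P):
    """Closed form for the slit-limited case: C*tau*D*P*phi / (F2 * phi'**2)."""
    return C * tau * D * P * phi / (F2 * phi_h ** 2)

# Assumed values in arbitrary but consistent units.
base = dict(C=1.0, tau=0.2, D=4000.0, phi=2.4e-6, phi_h=4.8e-6,
            r=1.0, F2=2.0, P=10.0)

E = star_irradiance(**base)
E_closed = star_irradiance_closed(1.0, 0.2, 4000.0, 2.4e-6, 4.8e-6, 2.0, 10.0)
E_big = star_irradiance(**{**base, "D": 8000.0})          # telescope diameter doubled
E_seeing = star_irradiance(**{**base, "phi_h": 3.6e-6})   # better seeing

print(E / E_closed, E_big / E, E_seeing / E)
```

With the camera focal ratio and slit width held fixed, doubling D doubles the speed, and improving the seeing raises the speed as the inverse square of φ'.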


12.2.f. CONCLUDING REMARKS 


We have already noted here that many spectroscopic observations of stellar 
sources, especially with large telescopes, are made in the slit-limited mode with 
the star image wider than the slit. Compensation for atmospheric seeing with 
adaptive optics makes it possible to obtain higher spectral resolving power 
without losing an undue amount of light at the slit. Taking full advantage of 
such techniques requires, of course, careful attention to detector pixel size. 

In the absence of adaptive optics techniques it is possible to recover much of 
the light intercepted by the slit jaws with so-called image slicers. Such a device, in
effect, slices the image into several strips and places these end-to-end along the
length of the slit. For an excellent discussion of the principles of image slicers 
and their practical realization, the reader should consult the reference by Hunten 
(1974). 


12.3. FIBER-FED SPECTROMETERS 


The relations derived for slit spectrometers in the previous section are 
generally applicable to fiber-fed spectrometers, but only after one important 
difference is factored into the equations. This difference is that of focal ratio 
degradation (FRD), a factor discussed in Section 9.6. In our following discussion 
we assume that the output end of the fiber is positioned at the entrance slit of a 
spectrometer and that the slit width is the same as for the spectrometer without 
the fiber. 

Consider first an existing slit spectrometer built to match a given telescope.
The beam from the telescope fills the collimator, as shown in Fig. 12.4, and
F₁ = f₁/d₁ = F = f/D. When a fiber is introduced, the effect of FRD is to
expand the beam, hence F₁ < F, and lose light directly at the overfilled
collimator. To avoid this light loss, a spectrometer with a larger d₁ and smaller
F₁ can be built. Assuming the projected slit width w' is to remain constant, this
requires a smaller F₂, hence a faster camera and generally a more difficult design.

It is important to note that this larger spectrometer, built specifically to recover 
lost light, does not have a larger spectral resolving power, as might be expected 
from Eq. (12.2.3). Given d₁ = f₁/F₁ and D = f/F, we can rewrite Eq. (12.2.3) as

$$\mathcal{R} = \left(\frac{F_1}{F}\right)\frac{\lambda A d_1}{r\phi D} = \frac{\lambda A f_1}{r\phi f}.$$    (12.3.1)

This relation shows clearly that holding f₁ constant and increasing d₁ does not
increase ℛ.
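
A short numerical check makes the point concrete. The sketch below assumes the form ℛ = (F₁/F) λAd₁/(rφD) = λAf₁/(rφf) for a fiber-fed spectrometer; the telescope, collimator, and dispersion values are illustrative assumptions.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-second

def resolving_power_beam_form(lam, A, r, phi, D, F, d1, F1):
    """R = (F1/F) * lam*A*d1 / (r*phi*D), with F1 = f1/d1 and F = f/D."""
    return (F1 / F) * lam * A * d1 / (r * phi * D)

def resolving_power_focal_form(lam, A, r, phi, f, f1):
    """R = lam*A*f1 / (r*phi*f); the beam diameter d1 does not appear."""
    return lam * A * f1 / (r * phi * f)

# Assumed values: lengths in mm, wavelength in nm, A in rad/nm.
lam, A, r = 500.0, 1.2e-3, 1.0
phi = 1.0 * ARCSEC
D, F = 4000.0, 8.0
f = F * D                 # telescope focal length
f1 = 800.0                # collimator focal length, held fixed

# Matched spectrometer (F1 = F) and one enlarged to accept an f/5 fiber output.
R_matched = resolving_power_beam_form(lam, A, r, phi, D, F, d1=f1 / 8.0, F1=8.0)
R_enlarged = resolving_power_beam_form(lam, A, r, phi, D, F, d1=f1 / 5.0, F1=5.0)
R_focal = resolving_power_focal_form(lam, A, r, phi, f, f1)

print(f"{R_matched:.0f} {R_enlarged:.0f} {R_focal:.0f}")
```

All three values agree: the larger collimator beam recovers light but leaves the resolving power where it was.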

The etendue of the telescope is unchanged by the introduction of a fiber, hence 
U given in Eq. (12.2.7) is unchanged. It is left to the reader to take Eqs. (12.2.7) 
and (12.3.1) and form the different resolution products given in Section 12.2 for 
the slit spectrometer. The outcome of this exercise is that the luminosity- 
resolution and flux-resolution products can only decrease when a fiber is 
introduced. 

This decrease in efficiency of a fiber-fed spectrometer for a single source is 
more than regained when multiple fibers are used. Individual fibers on sources on 
a telescope focal surface are deployed side-by-side along a spectrometer slit and 
the spectra of many sources are obtained simultaneously. With dozens of fibers so
arrayed, the gain in observing efficiency is significant. We discuss some of the
optical characteristics of fiber-fed spectrometers in Chapter 15.


12.4. SLITLESS SPECTROMETERS 


Based on the discussions in Section 12.2, the relations for slitless spectro- 
meters are easily found. The major differences for the slitless mode are: (1) the 
image size of a stellar source is set by atmospheric seeing or diffraction rather 
than a slit; (2) the anamorphic magnification is one in all practical configurations; 
and (3) the diameter d₁ is the beam size at the dispersing element.

Thus the relations in Section 12.2 apply to slitless configurations if φ is
replaced by φ' and r is set equal to one. With these changes the limit of resolution
is given by

$$\delta\lambda = \frac{\phi' D}{A d_1} = \frac{\phi' f}{A s},$$    (12.4.1)

where f is the telescope focal length and d₁ is the diameter of the dispersing
element. In the objective mode, as shown in Fig. 12.2, s = f and d₁ = D. The
irradiance of an image with spectral band δλ is given by Eq. (12.2.19).


12.5. SPECTROMETERS IN DIFFRACTION LIMIT 


We noted in Section 12.2 that the relation for the limit of resolution did not 
take into account the limit on image size set by diffraction. In this section we 
determine the form of the spectrometric parameters in the case where a stellar 
image in the focal plane of a diffraction-limited telescope is the effective entrance 
aperture for a perfect spectrometer. 

One important characteristic of a perfect image from a telescope with an 
annular aperture is the radius of the first dark Airy ring, given by Eq. (10.2.9) for 
a clear aperture. This radius, in angular units, is also the limit of resolution 
according to the Rayleigh criterion, as given by Eq. (10.2.22). We now assume 
the effective width and height of the entrance aperture are equal to this radius, 
hence 


$$w = h = 1.22\gamma\lambda F,$$    (12.5.1)

where 1.22γ = w₁, from Table 10.1. Substituting Eq. (12.5.1) in Eqs. (12.2.1)
gives

$$w' = 1.22\gamma\,r\lambda F_2 \approx r\lambda F_2,$$    (12.5.2a)

$$h' = 1.22\gamma\,\lambda F_2 \approx \lambda F_2.$$    (12.5.2b)

From this point on we take the approximate relations given in Eq. (12.5.2a,b). We
now apply the relations in Section 12.2 to find the spectrometric parameters for a
diffraction-limited telescope-spectrometer system. The relation in Eq. (12.5.2a)
can be used to find rF₂ for a detector whose pixel size Δ is matched to w'.
Assuming 2Δ = w', as done following Eq. (12.2.1), we find rF₂ = 80 for a pixel
size of 20 μm with λ = 500 nm. It is evident from this value of rF₂ that a
spectrometer on a diffraction-limited telescope has no need for a Schmidt-type
camera in the visible or near-infrared spectral regions.
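
The arithmetic behind the value rF₂ = 80 follows from w' ≈ rλF₂ with w' = 2Δ. A minimal sketch, with a hypothetical function name:

```python
def matched_r_F2(pixel_size, wavelength, pixels_per_width=2.0):
    """From w' = r * lambda * F2 with w' = pixels_per_width * pixel_size."""
    return pixels_per_width * pixel_size / wavelength

# 20-micron pixels at 500 nm, both expressed in meters.
r_F2 = matched_r_F2(20e-6, 500e-9)
print(f"r*F2 = {r_F2:.1f}")
```

The same function can be evaluated at other wavelengths or pixel sizes; for example, a longer wavelength with the same pixels gives a proportionally smaller rF₂.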

Following the procedure in Section 12.2.a we take Δl = w' as the separation
between two monochromatic images that are resolved by the Rayleigh criterion.
The difference in wavelength between these two images, according to Eq.
(12.1.1), is given by δλ₀ = Δl/f₂A. Therefore

$$\delta\lambda_0 = \frac{r\lambda F_2}{f_2 A} = \frac{r\lambda}{A d_1} = \frac{\lambda}{A d_2},$$    (12.5.3)

$$\mathcal{R}_0 = \frac{\lambda}{\delta\lambda_0} = \frac{A d_1}{r} = A d_2.$$    (12.5.4)

Note that the changes in Eqs. (12.2.2) and (12.2.3) to get the relations for the
diffraction-limited case are simply a substitution of λ for φD. Following this
procedure, and noting that φ' = φ, we can transform the remaining parameters in
Section 12.2 into their diffraction-limited counterparts. The results are


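$$U_0 = \frac{\pi\lambda^2}{4},$$    (12.5.5)

$$(\mathcal{L}\mathcal{R})_0 = \frac{\tau_0\pi}{4}\,\lambda^2 A d_2,$$    (12.5.6)

$$(\mathcal{F}\mathcal{R})_0 = C\tau_0 D^2 A d_2,$$    (12.5.7)

$$E_0 = \frac{C\tau_0 D^2\,\delta\lambda_0}{r(\lambda F_2)^2} = \frac{C\tau_0 D^2 P}{\lambda F_2}.$$    (12.5.8)

The relations in Eqs. (12.5.5) to (12.5.8) apply only to stellar sources, with Eq.
(12.5.8) the counterpart of Eq. (12.2.19).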

It is possible to make various comparisons between the relations for diffrac- 
tion-limited and seeing-limited cases, and we give two such comparisons. Taking 
the ratio of Eqs. (12.5.7) and (12.2.13) gives 


$$\frac{(\mathcal{F}\mathcal{R})_0}{\mathcal{F}\mathcal{R}} = \frac{\tau_0 D_0^2\,\phi'}{\tau D\lambda},$$    (12.5.9)

where the parameters with subscript 0 refer to the diffraction-limited system.
Consider the 2.4-m Hubble Space Telescope and a ground-based 8-m telescope.
Taking τ₀ = τ, φ' = 0.5 arc-sec, and λ = 500 nm, gives an ℱℛ ratio of 3.6. A
complete analysis of the detectability of a faint source must take into account the
sky background and detector characteristics; we defer this discussion including
signal-to-noise ratio (SNR) to Chapter 16.
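
The quoted ratio can be reproduced with a few lines of Python, assuming the ratio has the form τ₀D₀²φ'/(τDλ). With the exact conversion of 0.5 arc-sec to radians the result is about 3.5; rounding 0.5 arc-sec to 2.5 microradians gives 3.6. That this rounding explains the quoted figure is an inference, not something stated in the text.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arc-second

def fr_ratio(D0, D, phi_seeing, lam, tau0_over_tau=1.0):
    """Assumed form of Eq. (12.5.9): tau0 * D0**2 * phi' / (tau * D * lam)."""
    return tau0_over_tau * D0 ** 2 * phi_seeing / (D * lam)

# 2.4-m diffraction-limited telescope versus 8-m seeing-limited telescope,
# lengths in meters, seeing disk of 0.5 arc-sec, wavelength 500 nm.
ratio_exact = fr_ratio(2.4, 8.0, 0.5 * ARCSEC, 500e-9)
ratio_rounded = fr_ratio(2.4, 8.0, 2.5e-6, 500e-9)

print(f"exact conversion: {ratio_exact:.2f}, rounded conversion: {ratio_rounded:.2f}")
```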

From Eqs. (12.2.3) and (12.5.4) we find the ratio of the resolving powers
ℛ/ℛ₀ = λ/φD. For visible wavelengths and excellent seeing conditions, say
0.5 arc-sec, and D > 1 m, we find that ℛ is small compared to ℛ₀. Thus the
resolution at visible wavelengths at which spectrometers on ground-based
telescopes are often used is well below the resolution that is theoretically possible.
At infrared wavelengths, where λ is larger and seeing is typically better, it is
possible that ℛ can approach ℛ₀.


Chapter 13 Dispersing Elements and Systems


We now turn our attention to specific dispersing elements, discussing in turn 
the prism, diffraction grating, and Fabry-Perot interferometer. We give expres- 
sions for the angular dispersion and resolving power for each of these, and point 
out other characteristics that are important in their application. Finally, we also 
discuss the characteristics of the Fourier spectrometer. 


13.1. DISPERSING PRISM 


The angular dispersion A of a prism used at minimum deviation is derived in
Section 3.2. A dispersing prism is generally used at or near minimum deviation,
defined as that orientation at which the deviation angle in Fig. 3.6 is a minimum
and rays inside the prism are parallel to the base. From Eq. (3.2.8) we have

$$A = \frac{t}{a}\,\frac{dn}{d\lambda},$$    (13.1.1)

where t is the base length, a the beam width into and out of the prism, and dn/dλ
the rate of change of index with wavelength. Because a is the same on either side
of the prism, there is no anamorphic magnification and r = 1.

Substituting A into Eq. (12.5.4), and noting that d₁ = a, gives the limiting
spectral resolution of a single prism as

$$\mathcal{R}_0 = t\,\frac{dn}{d\lambda}.$$    (13.1.2)


If there are k identical prisms in series, then Eqs. (13.1.1) and (13.1.2) are each 
multiplied by k. 

Dispersion curves for three glasses selected from the Schott glass catalog are 
shown in Fig. 13.1. The form of these curves is typical of those for all transparent 
glasses, with dn/dλ going approximately as the inverse cube of the wavelength.
Taking UBK7 at λ = 500 nm, as an example, we find dn/dλ = 0.066 μm⁻¹, and
thus ℛ₀ = 10,000 for t = 150 mm. This base length corresponds to a beam
diameter of about 100 mm for a 60° prism or about 1400 mm for a 6° prism.
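
These numbers can be checked with a short sketch. The resolving power uses Eq. (13.1.2) directly. The beam width uses a standard minimum-deviation geometry that is assumed here rather than taken from this section: face length L = t/(2 sin(apex/2)), angle of incidence i with sin i = n sin(apex/2), and a = L cos i. The index n = 1.52 is an assumed round value for UBK7 near 500 nm.

```python
import math

def prism_resolving_power(t_um, dn_dlam_per_um):
    """Eq. (13.1.2): R0 = t * dn/dlambda, with t in microns."""
    return t_um * dn_dlam_per_um

def beam_width_min_deviation(t, apex_deg, n):
    """Beam width a for a fully illuminated prism of base t at minimum deviation.

    Assumed geometry: face length L = t / (2 sin(apex/2)),
    sin(i) = n sin(apex/2), a = L cos(i).
    """
    half = math.radians(apex_deg) / 2.0
    face = t / (2.0 * math.sin(half))
    incidence = math.asin(n * math.sin(half))
    return face * math.cos(incidence)

R0 = prism_resolving_power(150.0e3, 0.066)          # t = 150 mm
a_60 = beam_width_min_deviation(150.0, 60.0, 1.52)  # mm
a_6 = beam_width_min_deviation(150.0, 6.0, 1.52)    # mm

print(f"R0 = {R0:.0f}, a(60 deg) = {a_60:.0f} mm, a(6 deg) = {a_6:.0f} mm")
```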

Prisms are often used in the objective mode on Schmidt telescopes in the 1-m
class to obtain spectra suitable for classification. In this configuration λ/φD is
approximately 0.1 and therefore ℛ = 1000 for a UBK7 prism whose base is
150 mm. This resolution is sufficient for this mode where the plate factor P is
typically about 150 Å/mm.

Fig. 13.1. Dispersion curves for three glasses from Schott glass catalog.

As we show in Section 13.2, the resolution of a prism is low compared to what 
is possible with a grating large enough to accept the same beam diameter. For this 
reason, prisms are little used as primary dispersing elements in slit spectrometers, 
having been largely replaced by gratings. Prisms are, however, often used as 
cross-dispersers in spectrometers that use gratings for the primary dispersion. For 
the objective mode, a prism is more efficient than a grating, for reasons made 
clear when we discuss grating efficiency. 


13.2. DIFFRACTION GRATING; BASIC RELATIONS 


The diffraction grating is the primary dispersing element in most astronomical 
spectrometers, where it has the advantage of significantly larger spectral resolving 
power than a prism of comparable size. A grating is also versatile in the spectral 
formats it can provide and can be quite efficient over a reasonable spectral range, 
though usually not as efficient as a prism. We discuss these and other character- 
istics of gratings in this section and the following one. A number of additional 
references that discuss gratings are given at the end of the chapter; the excellent 
book by Loewen and Popov (1997) is especially recommended. 


13.2.a. GRATING EQUATION


The starting point for our discussion of gratings is the well-known grating 
equation, the derivation of which is found in any introductory optics text. For the 
usual case of a chief ray in the xz-plane, with the grating grooves parallel to the y 
axis in the yz-plane, the grating equation is 


$$m\lambda = \sigma(\sin\beta \pm \sin\alpha),$$    (13.2.1)


where m is the order number, σ is the distance between successive, equally spaced
grooves or slits, and α and β are angles of incidence and diffraction, respectively,
measured from the normal to the grating surface. The parameter σ is also called
the grating constant. The plus sign in Eq. (13.2.1) applies to a reflection grating;
the minus sign to a transmission grating. We derive the general form of this 
equation from the point of view of Fermat’s Principle in Chapter 14. 
Schematics of grating cross sections are shown in Fig. 13.2 for both a 
reflection and a transmission grating. The plane defined by the incident ray and 
normal to the grating surface is in the plane of the paper in Fig. 13.2, with the 
grating grooves perpendicular to this plane. The angles α and β are governed by
the same sign convention as that given for i and i' in Chapter 2. For a reflection
grating α and β have the same signs if they are on the same side of the grating
normal, while for a transmission grating they have the same signs if the diffracted
ray crosses the normal at the point of diffraction. In each of the diagrams in Fig.
13.2, α and β have the same signs. Note that m = 0 for a reflection grating when
α = −β, while for a transmission grating this condition holds when α = β.

Fig. 13.2. Schematic showing angles of incidence α and diffraction β for (a) reflection grating and
(b) transmission grating. See the discussion following Eq. (13.2.1) for sign convention.


13.2.b. ANGULAR DISPERSION 


The angular dispersion follows directly from Eq. (13.2.1) by holding α
constant and differentiating with respect to λ, with the result

$$A = \frac{d\beta}{d\lambda} = \frac{m}{\sigma\cos\beta},$$    (13.2.2a)

or

$$A = \frac{\sin\beta \pm \sin\alpha}{\lambda\cos\beta}.$$    (13.2.2b)

Unless otherwise noted, the following discussion refers to reflection gratings,
hence the plus sign in Eq. (13.2.2b).

From Eq. (13.2.2a) we see that angular dispersion in a given order m is a
function of σ and β. When looked at from the point of view of this equation,
changing A means choosing a grating with a different grating constant and/or
using the grating at a different angle of diffraction.

We see from Eq. (13.2.2b) that A, at a given wavelength, is set entirely by
the angles α and β, independent of m and σ. Thus a given angular dispersion can
be obtained with many combinations of m and σ, provided the angles at the
grating are unchanged and m/σ is constant. Recognition of this fact led to the
development of coarsely ruled reflection gratings specifically designed to achieve
high angular dispersion by making α and β large, typically about 60°. Such
gratings, called echelles, have groove densities in the range of 300 to 30 per mm
with values of m in the range of 10 to 100 for visible light. On the other hand,
typical first- or second-order gratings have groove densities in the range 300 to
1200 per mm. Low-order gratings are often called echelettes to distinguish them
from echelles. A typical echelle and an echelette grating used in first order are
compared in a later section.
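
The equivalence of the two forms of the angular dispersion, and its dependence on m/σ alone at fixed angles, can be illustrated with a short sketch for a reflection grating. The groove density, order, angle of incidence, and wavelength below are assumed values.

```python
import math

def diffraction_angle(m, lam, sigma, alpha):
    """Solve m*lam = sigma*(sin(beta) + sin(alpha)) for beta (radians)."""
    return math.asin(m * lam / sigma - math.sin(alpha))

def dispersion_from_order(m, sigma, beta):
    """Eq. (13.2.2a): A = m / (sigma * cos(beta))."""
    return m / (sigma * math.cos(beta))

def dispersion_from_angles(lam, alpha, beta):
    """Eq. (13.2.2b), plus sign: A = (sin(beta) + sin(alpha)) / (lam * cos(beta))."""
    return (math.sin(beta) + math.sin(alpha)) / (lam * math.cos(beta))

# Assumed example: 1200 grooves/mm in first order, lengths in nm.
lam = 500.0
sigma = 1.0e6 / 1200.0
alpha = math.radians(20.0)

beta = diffraction_angle(1, lam, sigma, alpha)
A_a = dispersion_from_order(1, sigma, beta)
A_b = dispersion_from_angles(lam, alpha, beta)

# A grating four times coarser used in fourth order keeps m/sigma fixed,
# so the diffraction angle and the angular dispersion are unchanged.
beta_coarse = diffraction_angle(4, lam, 4.0 * sigma, alpha)
A_coarse = dispersion_from_order(4, 4.0 * sigma, beta_coarse)

print(f"beta = {math.degrees(beta):.2f} deg, A = {A_a:.4e} rad/nm")
```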


13.2.c. ANAMORPHIC MAGNIFICATION 


The relation for anamorphic magnification is derived from Eq. (13.2.1) by
holding λ constant and finding the change in β for a change in α. The result is
cos β dβ + cos α dα = 0, and it follows that

$$r = \left|\frac{d\beta}{d\alpha}\right| = \frac{\cos\alpha}{\cos\beta} = \frac{d_1}{d_2}.$$    (13.2.3)

The relation between the beam widths and angles is derived from the geometry in
Fig. 13.3, with d₁/cos α = d₂/cos β.

Because r is in the denominator of the resolving power given in Eq. (12.2.3), it
is clear that the choice r < 1, hence β < α, gives higher resolution. This
condition, in turn, means that the grating normal is more nearly in the direction
of the camera than of the collimator. If the grating is to accept all of the light from
the collimator, it follows that W = d₁/cos α is required, where W is the width of
the grating. In the direction parallel to the grating grooves, the height of the
grating is given by H = d₁.

Fig. 13.3. Change in beam width due to anamorphic magnification of grating. See Eq. (13.2.3).

For a typical echelette the value of r is not far from unity and has little effect
on the spectral resolving power. In the case of an echelle the value of r has a
significant effect on the resolving power, as our example to follow demonstrates.


13.2.d. SPECTRAL RESOLVING POWER 


Substituting Eq. (13.2.2a) into Eq. (12.5.4) gives the spectral resolving power
in the diffraction limit as

$$\mathcal{R}_0 = \frac{d_2\,m}{\sigma\cos\beta} = \frac{mW}{\sigma} = mN,$$    (13.2.4a)

where N is the total number of grooves in the grating width W. The relation
between W, d₂, and β is shown in Fig. 13.3. Writing Eq. (13.2.4a) in terms of
angles, we get

$$\mathcal{R}_0 = \frac{W}{\lambda}(\sin\beta + \sin\alpha).$$    (13.2.4b)

From Eq. (13.2.4b) we see that ℛ₀ is directly proportional to the grating width for
a given pair of angles. Using the geometry in Fig. 13.3, we also see that the
numerator in Eq. (13.2.4b) has a simple geometric interpretation; it is the total
path difference between the marginal rays spanning the grating width, and ℛ₀ is
the number of wavelengths in this path difference.

The resolving power for the seeing-limited case is obtained by replacing λ in
Eq. (13.2.4b) by φD, and is

$$\mathcal{R} = \frac{W}{\phi D}(\sin\beta + \sin\alpha).$$    (13.2.5)

Given that ℛ ∝ W/D, it is evident from this relation why there has been a
concerted effort to produce larger reflection gratings and echelles, as the size of
telescopes has increased.

For ease of reference, the important grating relations are brought together in 
Table 13.1. 


13.2.e. FREE SPECTRAL RANGE 


For a given pair of α and β, the grating equation is satisfied for all wavelengths
for which m is an integer. Thus there are two wavelengths in successive orders,
λ and λ', for which we have mλ' = (m + 1)λ. The wavelength difference
Δλ = λ' − λ is called the free spectral range, where

$$\Delta\lambda = \lambda' - \lambda = \lambda/m.$$    (13.2.6)

The two wavelengths are diffracted in the same direction and confusion is the
result unless one is rejected by a filter or they are separated with a
cross-disperser. Both techniques are used to eliminate the wavelength overlap, with a
cross-disperser most often used when m is large and a filter when m is small.


Table 13.1

General Equations for Reflection Gratings

$$m\lambda = \sigma(\sin\beta + \sin\alpha)$$

$$A = \frac{d\beta}{d\lambda} = \frac{m}{\sigma\cos\beta} = \frac{\sin\beta + \sin\alpha}{\lambda\cos\beta}$$

$$\mathcal{R}_0 = \frac{W}{\lambda}(\sin\beta + \sin\alpha)$$

$$\mathcal{R} = \frac{W}{\phi D}(\sin\beta + \sin\alpha)$$
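
As a quick illustration of Eq. (13.2.6), the sketch below evaluates the free spectral range at 500 nm for a few orders and confirms the overlap condition mλ' = (m + 1)λ. The orders chosen are assumed examples.

```python
def free_spectral_range(lam, m):
    """Eq. (13.2.6): delta_lambda = lam / m."""
    return lam / m

lam = 500.0  # nm
results = {}
for m in (1, 2, 45):
    fsr = free_spectral_range(lam, m)
    lam_prime = lam + fsr   # wavelength in order m that overlaps lam in order m + 1
    results[m] = (fsr, m * lam_prime, (m + 1) * lam)
    print(f"m = {m:2d}: free spectral range = {fsr:.2f} nm")
```

In first order the overlapping wavelength is an octave away and a filter suffices; in order 45 the overlap is only about 11 nm away, which is why cross-dispersion is the usual choice at high order.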


13.3. ECHELLES 


Echelles are diffraction gratings with relatively large groove spacings used in 
high order at large angles of incidence and diffraction. It is evident from Eq. 
(13.2.2b) that angular dispersion is larger for large angles α and β, hence a given
pair of spectral lines is more widely separated in the camera focal plane. This 
means, in turn, that the resolving power is larger for a given grating width, as is 
evident from Eqs. (13.2.4b) and (13.2.5). Because an echelle is used in high 
orders, the free spectral range is small compared to that of an echelette, as shown 
by Eq. (13.2.6). The resulting spectrum, therefore, is typically one of many 
orders. 


13.3.a. COMPARISON OF GRATING AND ECHELLE


As illustration of the relations in Section 13.2, we now consider two specific 
gratings, an echelle and an echelette, and give their characteristics at a wavelength 
near 500 nm. The reflection gratings chosen are assigned the following
parameters:

              m     1/σ (mm⁻¹)    δ (°)
Echelette     1     1200          17.5
Echelle       45    79            63.5

α = δ + θ,   β = δ − θ,   θ > 0 if α > β


The reader can verify for each that the grating equation is satisfied for a
wavelength near 500 nm when θ = 0. The parameters chosen are typical of
those for a first-order grating and echelle.

The angle δ, the so-called blaze angle, is introduced here because it is one of
the key grating parameters; its significance is discussed fully in Section 13.4. At
this point we simply note that δ is the angle between the plane of the grating and
the plane of a single groove. We also note that grating efficiency is a maximum
when α and β are chosen as given, relative to the blaze angle. A sketch of the
geometry showing the relation between these angles is shown in Fig. 13.4.

It is useful at this point to write the relations in Table 13.1 for the special case
where θ = 0, hence α = β = δ. This defines the so-called Littrow configuration,
for which we get the results given in Table 13.2. Although the strict Littrow
configuration is not a practical one for a reflection grating, the angle θ is small in
most grating spectrometers. Thus the relations in Table 13.2 are useful for
calculating good first approximations to the true values, and comparisons of
different gratings are easily made.

For our chosen grating and echelle, ℛ and ℛ_0 in the Littrow mode are 6.36
times larger for the echelle, at the same d_1 and λ. It is conventional to describe an
echelle by its R-value, where R = tan δ. For our example we have chosen an R-2
echelle. It is also worth noting that W is 2.14 times larger for the echelle in the
Littrow configuration, as compared with the echelette.
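
Both ratios quoted here can be checked in a few lines. The sketch below assumes
the Littrow scalings in which the resolving power at fixed beam diameter and
wavelength is proportional to tan δ, and the grating width needed to intercept
the beam is W = d_1/cos δ. The function name is illustrative only.

```python
import math

def littrow_comparison(delta_echelle_deg, delta_grating_deg):
    """Return (resolving-power ratio, grating-width ratio) in Littrow."""
    de = math.radians(delta_echelle_deg)
    dg = math.radians(delta_grating_deg)
    tan_ratio = math.tan(de) / math.tan(dg)      # resolving power scales as tan(delta)
    width_ratio = math.cos(dg) / math.cos(de)    # W/d_1 = 1/cos(delta) for each
    return tan_ratio, width_ratio

tan_ratio, width_ratio = littrow_comparison(63.5, 17.5)
print(f"resolving power ratio {tan_ratio:.2f}, width ratio {width_ratio:.2f}")
```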





Fig. 13.4. Relation between blaze angle δ, grating normal GN, and angles of incidence and
diffraction.

Table 13.2

Equations for Reflection Gratings in Littrow Configuration^a

    m\lambda = 2\sigma\sin\delta
    A = \frac{d\beta}{d\lambda} = \frac{m}{\sigma\cos\delta} = \frac{2\tan\delta}{\lambda}
    \mathcal{R}_0 = \frac{2W\sin\delta}{\lambda} = \frac{2d_1\tan\delta}{\lambda}
    \mathcal{R} = \frac{2d_1\tan\delta}{\phi D}

^a α = β = δ.


Values derived from relations in Table 13.1 are given in Table 13.3 for our
example grating and echelle, with d_1 = 100 mm, φ = 1 arc-sec, and D = 4 m.
From the entries in Table 13.3 we see that ℛ of the grating changes very little
with changing θ, while for the echelle the change is proportionately much larger.
The trend in the values of W/d_1 follows a similar pattern.

Compared to ℛ_0 of the UBK7 prism given in Section 13.1, we see also from
Table 13.3 that the grating and echelle have values that are 12.5 and 80 times
larger. These increases in resolution potential and achievable resolving power for
a given slit width are obviously significant.


Table 13.3

Characteristics of Typical Grating and Echelle^a

Resolution in Littrow configuration

               Grating     Echelle
ℛ_0            1.3E5       8.0E5
ℛ              3.1E3       2.0E4

Parameters in non-Littrow configuration

                Grating                  Echelle
θ (°)     φDℛ/d_1     W/d_1        φDℛ/d_1     W/d_1
0          0.630       1.05         4.01        2.24
2          0.637       1.06         4.31        2.41
4          0.644       1.07         4.67        2.61
6          0.652       1.09         5.08        2.86

^a Parameters of grating and echelle are given in text:
d_1 = 100 mm; φ = 1 arc-sec; and D = 4 m.

The increase in resolution ℛ with increasing positive θ, and thus α > β, is a
result that merits further comment. This result is a bit surprising at first glance
because the angular dispersion, according to Eq. (13.2.2a), decreases with
increasing θ. At the same time, however, the anamorphic magnification r
decreases at a faster rate. Hence ℛ, which is proportional to A/r, has the
behavior as already noted. We discuss the consequences of this further in terms of
the ℒℛ product in a later section.
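
The non-Littrow entries in Table 13.3 can be reproduced with a short
calculation. The sketch below assumes φDℛ/d_1 = (sin β + sin α)/cos α and
W/d_1 = 1/cos α, with α = δ + θ and β = δ − θ. These forms are our reading of
the general relations (Table 13.1 is not reproduced in this section), and the
function name is illustrative.

```python
import math

def non_littrow(delta_deg, theta_deg):
    """Return (phi*D*R/d_1, W/d_1) for blaze angle delta and offset theta."""
    delta = math.radians(delta_deg)
    theta = math.radians(theta_deg)
    alpha = delta + theta   # angle of incidence
    beta = delta - theta    # angle of diffraction
    resolution = (math.sin(beta) + math.sin(alpha)) / math.cos(alpha)
    width = 1.0 / math.cos(alpha)
    return resolution, width

# One row per theta: grating (delta = 17.5 deg) then echelle (delta = 63.5 deg).
rows = [(theta, non_littrow(17.5, theta) + non_littrow(63.5, theta))
        for theta in (0, 2, 4, 6)]
for theta, values in rows:
    print(theta, " ".join(f"{v:.3f}" for v in values))
```

The printed values agree with the tabulated ones to within rounding, and they
show directly that the echelle entries grow much faster with θ than the grating
entries because cos α is already small at θ = 0.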


13.3.b. IMMERSED ECHELLE 


Among the various possibilities for increasing the resolving power, a topic we
discuss in the following section, one way of increasing ℛ is to immerse the
ruled surface of the echelle in a material of index n. An example of an immersed
echelle is shown in Fig. 13.5 where a prism is used to couple the incident light
beam to the echelle. As shown in Section 14.1, the grating equation for an
immersed grating is


    m\lambda = n\sigma(\sin\beta + \sin\alpha).    (13.3.1)

The angular dispersion within the material of index n is given by

    A = \frac{m}{n\sigma\cos\beta} = \frac{\sin\beta + \sin\alpha}{\lambda\cos\beta}.    (13.3.2)





For an echelle with a given blaze angle δ, we see from Eq. (13.3.1) that the order
m for an immersed echelle is n times larger than for the same echelle in air. From
Eq. (13.3.2) we see that the angular dispersion within the prism is independent of
the index n. However, the diffracted light emerging from the prism is deviated an
additional amount. For the case shown in Fig. 13.5, with diffracted rays at near
normal incidence on the prism face, an angular difference of dβ within the prism
is increased to n dβ upon emergence. Thus A following the prism is n times A
within the prism.

As a consequence of these comments on m and A, Eq. (13.2.4a) is correct as
written, but Eqs. (13.2.4b) and (13.2.5) must be multiplied by n, hence ℛ_0 is n
times larger for an immersed echelle in the configuration shown in Fig. 13.5,
compared to the same echelle in air.

Dekker has shown that, for a given beam size, an additional gain is possible if 
the angle of incidence at the prism is large rather than near zero. The effect on the 
beam is one of anamorphic magnification of the incident beam. The interested 
reader should consult the article by Dekker (1987) for details.

Fig. 13.5. An immersed echelle with a prism to couple the incident light beam to the echelle.


13.3.c. COMMENTS ON LARGER RESOLVING POWER 


One of the consequences of the relations for resolving power in Tables 13.1
and 13.2, given ℛ ∝ d_1/D, is the requirement for larger gratings and echelles on
larger telescopes, simply to keep the resolving power constant. Not surprisingly,
there is continual pressure to increase ℛ, even for larger telescopes, as demands
are felt for more detailed spectroscopic study of astronomical sources.

The available parameters for increasing ℛ of an echelle on a telescope of a
given D are larger beam size, larger R-value, narrower entrance slit, and
immersion of the diffracting surface. A consequence of the first two factors is
larger echelles, but practical limits exist on the maximum size that can be
produced by the standard method of ruling with a diamond. The largest available
ruled width W for a single echelle is about 400 mm, giving an unvignetted beam
diameter of about 140 mm for α = 69.5° with an R-2 echelle. If some vignetting is
acceptable at the ends of this echelle, then beam diameters of up to 200 mm are
possible. For still larger beams it is necessary to use a mosaic of echelles, a topic
we discuss further in Section 13.4.b.
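
The beam diameter quoted for the largest ruled echelle is simply the ruled width
projected along the incident beam. A minimal sketch, assuming d_1 = W cos α
(equivalently W/d_1 = 1/cos α), with an illustrative function name:

```python
import math

def unvignetted_beam_mm(ruled_width_mm, alpha_deg):
    """Largest beam diameter intercepted without vignetting: d_1 = W cos(alpha)."""
    return ruled_width_mm * math.cos(math.radians(alpha_deg))

beam_mm = unvignetted_beam_mm(400.0, 69.5)
print(f"unvignetted beam diameter: {beam_mm:.0f} mm")
```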

A consequence of a narrower entrance slit is loss of light, unless seeing quality 
is improved by adaptive optics or active control of the thermal environment in the 
immediate vicinity of the telescope. The final factor, that of immersion of the
ruled surface, requires a prism of excellent optical quality. This is especially
important because the beam passes through the prism twice.


13.4. GRATING EFFICIENCY 


The absolute efficiency of a grating is defined as the fraction of the energy at a
given wavelength incident on the grating directed into a given diffracted order,
where the fraction of energy diffracted into a given order is determined by the
so-called blaze function. In this section we examine the characteristics of the
grating blaze function.

An exact treatment of grating efficiency is carried out by applying Maxwell’s 
equations of electromagnetic theory to the interaction of light with a grating 
surface. For our purposes the exact theory is beyond the scope of our treatment 
and a scalar approximation is used instead. The polarization of the incident light 
is ignored in the scalar theory, but the results obtained from it are a good first 
approximation to the exact results. Important differences between the two 
approaches are noted. For a discussion of the exact theory, the reader should 
consult the references given at the end of the chapter. An especially thorough 
discussion of the results derived from the exact theory is found in the book by
Loewen and Popov (1997). A good introduction to the results of scalar theory can 
be found in a handbook published by Milton Roy (1994). 


13.4.a. BLAZE FUNCTION


Consider a reflection grating consisting of a number N of equally spaced
facets, or grooves, of width b with center-to-center spacing σ, as shown in Fig.
13.6. For a beam of collimated light incident at angle α, the normalized intensity
of the diffracted wave is


    i(\alpha, \beta) = IF \cdot BF = \left(\frac{\sin N\nu'}{N\sin\nu'}\right)^2 \left(\frac{\sin\nu}{\nu}\right)^2,    (13.4.1)

where 2ν' is the phase difference between the centers of adjacent grooves and ν is
the phase difference between the center and edge of one groove. Equation
(13.4.1) is composed of two parts, an interference function IF and a blaze
function BF, with each function having a maximum value of unity. The derivation
of Eq. (13.4.1) is found in any introductory optics text.

Fig. 13.6. Schematic of unblazed reflection grating with groove width b and separation σ.

The relations for the phase differences are

    2\nu' = \frac{2\pi}{\lambda}\,\sigma(\sin\beta + \sin\alpha),    (13.4.2)
    \nu = \frac{\pi b}{\lambda}(\sin\beta + \sin\alpha).    (13.4.3)

Note that Eq. (13.4.2) is 2π/λ times the path difference between rays from successive
grooves. The interference function is a maximum when ν' = mπ, where m is an
integer. Substituting ν' = mπ in Eq. (13.4.2) gives the grating equation (13.2.1).

The blaze function BF is the normalized intensity of a single slit diffraction
pattern and is a maximum when ν = 0, hence α = −β, corresponding to m = 0 in
the grating equation. The derivation of this diffraction pattern is found in Section
10.1. As noted there, the first minimum in this pattern occurs at ν = π. We also
find that BF = 0.405 when ν = π/2 and that the angular width at this intensity
level is λ/b.

A sketch of the intensity pattern for a single wavelength is shown in Fig. 13.7, 
where it is evident that the pattern is simply the interference function IF 
modulated by the blaze function BF. From Fig. 13.7 we see that this grating 
directs most of the light to zero order and thus the efficiency in any other order is 
low. 
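
The behavior of the two factors is easy to explore numerically. The sketch below
assumes the standard forms IF = [sin Nν'/(N sin ν')]² and BF = (sin ν/ν)², and
uses the fact that ν/ν' = b/σ, so that in order m (where ν' = mπ) the blaze
function is evaluated at ν = mπb/σ. The function names and the value b/σ = 0.5
are illustrative assumptions, not values taken from the text.

```python
import math

def interference_function(nu_prime, n_grooves):
    """IF = [sin(N nu') / (N sin nu')]^2, equal to 1 at principal maxima."""
    s = math.sin(nu_prime)
    if abs(s) < 1e-9:          # nu' = m*pi: use the limiting value
        return 1.0
    return (math.sin(n_grooves * nu_prime) / (n_grooves * s)) ** 2

def blaze_function(nu):
    """BF = (sin nu / nu)^2, the single-slit diffraction envelope."""
    if abs(nu) < 1e-12:
        return 1.0
    return (math.sin(nu) / nu) ** 2

def order_intensity(order, b_over_sigma):
    """Relative intensity of order m for an unblazed grating (IF = 1 there)."""
    return blaze_function(order * math.pi * b_over_sigma)

half_power_like = blaze_function(math.pi / 2)   # the 0.405 level
first_order = order_intensity(1, 0.5)           # hypothetical b/sigma = 0.5
print(f"BF(pi/2) = {half_power_like:.3f}, first order = {first_order:.3f}")
```

With these assumptions the zero order always has unit relative intensity, while
every dispersed order is reduced by the blaze envelope, which is the point made
in connection with Fig. 13.7.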

In order to increase the efficiency in a dispersed order it is necessary, in effect,
to move the blaze function along the axis in Fig. 13.7 until its peak coincides with
an interference maximum in the dispersed order. This is done by tilting each facet
of the grating by an angle δ relative to the surface, as shown in Fig. 13.8. For a
tilted groove the phase difference from center to edge is

    \nu = \frac{\pi b}{\lambda}[\sin(\beta - \delta) + \sin(\alpha - \delta)] = \frac{\pi b}{\lambda}(\sin\theta - \sin\theta'),    (13.4.4)

where the width b = σ cos δ for a groove profile with right angle corners.

Fig. 13.7. Intensity pattern of single diffracted wavelength for grating in Fig. 13.6. BF, blaze
function; IF, interference factor.

Fig. 13.8. Reflection grating of Fig. 13.6 with tilted facets to shift blaze function BF by angle 2δ.
FN, facet normal; GN, grating normal.

The shifted blaze function BF is a maximum when ν = 0, and from Eq.
(13.4.4) we get α + β = 2δ at the blaze peak. From Fig. 13.8 we get α = δ + θ
and β = δ − θ', where θ' = θ at the peak of the blaze function. The wavelength at
the peak of the blaze is called the blaze wavelength λ_b. At this wavelength Eqs.
(13.2.1) and (13.2.2b) become

    m\lambda_b = 2\sigma\sin\delta\cos\theta,    (13.4.5)
    A = \frac{2\sin\delta\cos\theta}{\lambda_b\cos\beta}.    (13.4.6)


The net effect of “blazing” a grating is to get maximum efficiency in the same 
direction in which light would be reflected by specular reflection in the absence of 
diffraction. Note that in a Littrow configuration α = β = δ, and the light returns
on itself. The groove profile for a transmission grating is similar to that shown in
Fig. 13.8, where each facet is a long narrow prism. In this case the blaze peak is in
the direction in which light would be refracted in the absence of diffraction.

Fig. 13.9. Schematic of blaze function envelope with blaze peak at λ = λ_b. See Eq. (13.4.7).


The diffracted efficiency for a wavelength not at the blaze peak is determined
by the value of BF for that wavelength in the diffracted direction. Figure 13.9
shows two wavelengths, λ_+ and λ_−, on opposite sides of λ_b. The grating equation
for these two wavelengths is

    m\lambda_\pm = \sigma\sin\alpha + \sigma\sin(\beta_b \pm \varepsilon_\pm)
                 = m\lambda_b - \sigma\sin\beta_b(1 - \cos\varepsilon_\pm) \pm \sigma\cos\beta_b\sin\varepsilon_\pm,    (13.4.7)

where β_b is the diffraction angle at the blaze peak.

We now choose ε_± as that pair of angles for which BF = 0.405, noting again
that the width of a monochromatic single-slit diffraction peak between these
points is λ/b. For a grating groove b = σ cos δ and therefore ε_± = λ_±/(2σ cos δ).
This relation for ε_± is accurate only in the small-angle approximation, but for
most gratings of interest this accuracy is sufficient.

Assuming ε_± is small and cos β_b ≈ cos δ, we drop the middle term in Eq.
(13.4.7) and find the wavelength limits

    \lambda_\pm = \frac{m\lambda_b}{m \mp 1/2},    (13.4.8)
    \lambda_+ - \lambda_- = \frac{m\lambda_b}{m^2 - 1/4} \approx \frac{\lambda_b}{m}  for large m.    (13.4.9)

In first order, λ_+ = 2λ_b and λ_− = 2λ_b/3. The asymmetry in these wavelengths
about the peak is expected because the blaze function is broader for the longer
wavelength.

The overall blaze function for wavelengths from λ_− to λ_+ in first order is
shown in Fig. 13.10. The values of ν used to calculate BF at other wavelengths are
given by ν = mπ(λ_b − λ)/λ. At the wavelengths λ_± given by Eq. (13.4.8), the
reader can verify that ν = ±π/2, as expected. Published efficiency curves of
first-order gratings with σ greater than a few wavelengths are similar to the curve in
Fig. 13.10, though there is a difference between different polarizations of light.
As examples, see efficiency curves in the reference by Loewen and Popov (1997).

For an echelle with large m, the asymmetry in the wavelengths given by Eq. 
(13.4.8) is much smaller and can usually be ignored. The blaze function spanning 
many echelle orders has the form shown in Fig. 13.11 for the Littrow config- 
uration. The width of each peak in the scalloped curve is given by Eq. (13.4.9). 
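
The contrast between first order and high order can be made concrete with a
short calculation. The sketch below assumes the wavelength limits take the form
λ_± = mλ_b/(m ∓ 1/2), which reproduces the first-order values 2λ_b and 2λ_b/3
quoted in the text, and that the blaze phase is ν = mπ(λ_b − λ)/λ. The function
names, and the choice of 500 nm for λ_b, are illustrative.

```python
import math

def blaze_limits(order, blaze_wavelength):
    """Wavelengths (lambda_minus, lambda_plus) at which BF falls to 0.405."""
    lam_plus = order * blaze_wavelength / (order - 0.5)
    lam_minus = order * blaze_wavelength / (order + 0.5)
    return lam_minus, lam_plus

def blaze_phase(order, blaze_wavelength, wavelength):
    """nu = m*pi*(lambda_b - lambda)/lambda."""
    return order * math.pi * (blaze_wavelength - wavelength) / wavelength

first_minus, first_plus = blaze_limits(1, 500.0)      # first-order grating
ech_minus, ech_plus = blaze_limits(45, 500.0)         # echelle in order 45
print(f"m = 1:  {first_minus:.1f} to {first_plus:.1f} nm")
print(f"m = 45: {ech_minus:.2f} to {ech_plus:.2f} nm, "
      f"width {ech_plus - ech_minus:.2f} nm vs lambda_b/m = {500.0 / 45:.2f} nm")
```

In first order the limits are strongly asymmetric about λ_b, while in order 45
they are nearly symmetric and their separation is very close to λ_b/m.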

One important feature of the efficiency curve in Fig. 13.11 is that each peak
covers one free spectral range of the echelle spectrum, as is seen by comparing
Eqs. (13.2.6) and (13.4.9). Hence each wavelength over an extended range is
diffracted with an efficiency no less than 40% that of the blaze wavelength.

Fig. 13.10. Blaze function of grating with m = 1. See Eq. (13.4.8) and following discussion.

Fig. 13.11. Blaze function for echelle with m >> 1 in Littrow configuration. See Eq. (13.4.9).

When an echelle with rectangular grooves is illuminated at an angle α > δ,
part of the width of each facet is not used and the effective groove width is less
than σ cos δ. From the geometry in Fig. 13.12 we find that the effective groove
width b' is

    b' = \sigma\cos\alpha/\cos\theta.    (13.4.10)


Because the groove width is smaller, the angular width of the blaze function is
larger. At the same time, according to Eq. (13.2.2a), the angular dispersion is
smaller and one free spectral range has a smaller angular spread. Combining these
two effects we find that the fraction of BF spanned by one free spectral range is a
factor cos α/cos β smaller than in the Littrow mode. Thus variation in the blaze
function across a free spectral range is less pronounced when α > β than in the
Littrow mode.

Fig. 13.12. Cross section of echelle showing effective facet width b' for α > δ. See Eq. (13.4.10).

At the same time, however, the efficiency at the peak is reduced because some
light at the blaze peak is now diffracted into neighboring orders. As shown by
Bottema (1986), the peak efficiency is reduced by a factor of cos α/cos β. The net
result of all these effects is a broader, but lower, efficiency profile. Profiles of a
single blaze peak are shown in Fig. 13.13 for θ = 0 and θ = 5°, at a blaze angle
of 63.5°. Examination of these curves shows that the efficiency at the ends of a
free spectral range is somewhat higher at θ = 5° than at θ = 0°. The average
efficiency across the range, however, decreases as θ increases from zero.

The effects of groove shadowing for α > β are present for all gratings with
rectangular grooves, but are much less significant for gratings with small blaze
angle. For our examples in Section 13.3, cos α/cos β for θ = 5° is 0.95 for the
grating and 0.70 for the echelle.
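
These two factors follow directly from the angles. A minimal sketch, assuming
α = δ + θ and β = δ − θ, with blaze angles of 17.5° and 63.5° for the example
grating and echelle; the function name is illustrative:

```python
import math

def shadowing_factor(delta_deg, theta_deg):
    """Peak-efficiency factor cos(alpha)/cos(beta) for alpha > beta."""
    alpha = math.radians(delta_deg + theta_deg)
    beta = math.radians(delta_deg - theta_deg)
    return math.cos(alpha) / math.cos(beta)

grating_factor = shadowing_factor(17.5, 5.0)
echelle_factor = shadowing_factor(63.5, 5.0)
print(f"grating {grating_factor:.2f}, echelle {echelle_factor:.2f}")
```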

When light is incident on an echelle at an angle α < β, each facet is fully
illuminated but a fraction of the light is sent back in the general direction of the
collimator. The effective groove width is again given by Eq. (13.4.10), as is
evident from Fig. 13.12 when the arrows on the rays are reversed. Thus the blaze
peak is again broadened. In this case, however, the angular dispersion is larger for
increasing β and one free spectral range spans the same portion of the blaze peak.
The peak efficiency in this case is reduced by the factor cos β/cos α.

Fig. 13.13. Blaze function for single echelle order at two values of θ. Angular width of the free
spectral range is Δβ. See the discussion following Eq. (13.4.10).

A final point to be made on grating efficiency is the extent to which
polarization effects are important. Let P and S represent the diffracted efficiency
for light polarized parallel and perpendicular, respectively, to the length of the
grating grooves. We define fractional polarization as |P − S|/(P + S). The size of
this fraction is determined by the size of λ/b, hence the effect is smaller for a
grating with larger grooves. Unpublished measurements at λ = 480 nm on an
echelle with 79 grooves/mm and tan δ = 2 give a fractional polarization of 0.02,
where λ/b = 0.085. Thus the curves in Figs. 13.11 and 13.13 are valid for either
plane of polarization, to a good approximation. For a full discussion of grating
efficiency consult the reference by Loewen and Popov (1997).


13.4.b. RESOLUTION PRODUCTS 


The luminosity-resolution product for a grating at the blaze peak follows from
a substitution of A from Eq. (13.4.6) into Eq. (12.2.11). The result, given in terms
of d_1, is

    \mathcal{L}\mathcal{R} = \frac{\tau\pi D\phi' d_1}{2}\,\frac{\sin\delta\cos\theta}{\cos(\delta + \theta)}.    (13.4.11)

Another useful product in assessing spectrometer capability at the blaze peak is
found by substituting A from Eq. (13.4.6) into Eq. (12.2.3), with the result

    \mathcal{R}\phi = \frac{2d_1}{D}\,\frac{\sin\delta\cos\theta}{\cos(\delta + \theta)} = \frac{2W}{D}\sin\delta\cos\theta.    (13.4.12)

For a grating with a small blaze angle, these products are essentially constant over
a range of θ of many degrees. With our grating example in Section 13.3, ℒℛ
changes by less than 6% when θ is changed from 0 to 10°. For a typical grating,
therefore, it is sufficient to set θ = 0 in Eqs. (13.4.11) and (13.4.12).

For an echelle with a large blaze angle, on the other hand, the dependence of
ℒℛ on θ is an important one. Over the range of θ in Table 13.3, ℒℛ changes by
about 27% for our echelle example, assuming d_1 remains constant. The size of
this change indicates that a closer look at this product for echelles is in order.
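
The two percentages quoted for the grating and the echelle can be checked with
the angular factor alone, since all other quantities are held fixed. The sketch
below assumes that ℒℛ at fixed d_1 is proportional to
sin δ cos θ/cos(δ + θ); the function names are illustrative.

```python
import math

def lr_angular_factor(delta_deg, theta_deg):
    """Angle-dependent part of the luminosity-resolution product."""
    delta = math.radians(delta_deg)
    theta = math.radians(theta_deg)
    return math.sin(delta) * math.cos(theta) / math.cos(delta + theta)

def fractional_change(delta_deg, theta_start_deg, theta_end_deg):
    """Relative change in the product between two values of theta."""
    start = lr_angular_factor(delta_deg, theta_start_deg)
    end = lr_angular_factor(delta_deg, theta_end_deg)
    return end / start - 1.0

grating_change = fractional_change(17.5, 0.0, 10.0)
echelle_change = fractional_change(63.5, 0.0, 6.0)
print(f"grating: {100 * grating_change:.1f}%, echelle: {100 * echelle_change:.1f}%")
```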

It appears from Eq. (13.4.11) that ℒℛ can be made as large as desired for a
given d_1 by choosing α = δ + θ near π/2. This is not feasible in practice,
however, because the width W needed to collect all the light is larger than the
width of any practical grating. An additional complication at these large angles is
that the diffracted beam width d_2 is significantly greater than d_1 because of
anamorphic magnification. This extended width makes the design of cameras
more difficult at the small focal ratios usually required. As an illustration of these
comments, data on beam widths for a set of representative blaze angles and θ = 0
and 5° are given in Table 13.4. Given the state of grating technology and camera
design for large systems, spectrometer designs with W/d_1 < 4 are the only
reasonable choices for getting large ℒℛ. Hence R-4 echelles should be used very
near Littrow.

Table 13.4

Relative Beam Widths for Echelles^a,b

               θ = 5°                θ = 0
tan δ     W/d_1    d_2/d_1      W/d_1    d_2/d_1
2.0       2.72     1.42         2.24     1.00
2.4       3.30     1.53         2.60     1.00
2.8       3.95     1.65         2.97     1.00
3.2       4.68     1.78         3.35     1.00
3.6       5.48     1.92         3.74     1.00
4.0       6.36     2.08         4.12     1.00

^a α = δ + θ, β = δ − θ.
^b W/d_1 = 1/cos α, d_2/d_1 = cos β/cos α.
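
The entries in Table 13.4 follow from the two relations given in its footnotes,
W/d_1 = 1/cos α and d_2/d_1 = cos β/cos α with α = δ + θ and β = δ − θ. A
minimal sketch that regenerates the table, with an illustrative function name:

```python
import math

def relative_beam_widths(tan_delta, theta_deg):
    """Return (W/d_1, d_2/d_1) for an echelle with the given tan(delta)."""
    delta = math.atan(tan_delta)
    theta = math.radians(theta_deg)
    alpha = delta + theta
    beta = delta - theta
    return 1.0 / math.cos(alpha), math.cos(beta) / math.cos(alpha)

table = {}
for tan_delta in (2.0, 2.4, 2.8, 3.2, 3.6, 4.0):
    table[tan_delta] = (relative_beam_widths(tan_delta, 5.0)
                        + relative_beam_widths(tan_delta, 0.0))
    print(tan_delta, " ".join(f"{v:.2f}" for v in table[tan_delta]))
```

The output makes the design constraint visible: at θ = 5° the ratio W/d_1
passes 4 between tan δ = 2.8 and 3.2, while at θ = 0 it reaches 4 only near
tan δ = 4.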

Most large, commercially available, R-2 echelles have a width-to-length ratio
of two. From the data in Tables 13.3 and 13.4, we see that W/d_1 > 2, hence a
beam unvignetted by the echelle can cover its width but not its length. This is
illustrated in Fig. 13.14, where W_1 is the projected echelle width seen from the
collimator and H is the height of the grooves. In terms of W, height H = W/2
and W_1 = W cos α.

Given this width-to-length ratio, the designer of an echelle spectrometer using 
a single echelle has three options: 


(1) Choose d_1 = W_1, hence no vignetting at the echelle.
(2) Choose d_1 > W_1, but not larger than H, and accept some vignetting.
(3) Choose α < β in a way to make d_1 = H = W_1.


In options (1) and (2), we assume α is larger than β.

Option (1) is acceptable, but does not make full use of the echelle surface.
Option (3) has the same ℛ as option (1), but is constrained to cos α = 0.5,
hence α < β for an R-2 echelle. As noted in the previous section, the variation in
efficiency across a free spectral range is larger when α < β, hence (1) is
preferable even though its ℒℛ is the same.

Option (2) is a tradeoff between larger d_1 than (1) and smaller τ due to
vignetting. A larger d_1 translates into a wider entrance slit for the same projected
slit width w', a result that follows from Eq. (12.2.1a) written as w' = rDf_2φ/d_1.
A full analysis shows that the gain in light at the entrance slit more than offsets
the light loss at the echelle, up to the point where d_1 = H. For angle θ = 5° the
net gain amounts to about 17%.

Fig. 13.14. Overfilling of echelle to increase ℒℛ product. See Section 13.4.b.

It is important to note that another option available in the design of an echelle
spectrometer is a mosaic of echelles, as done for HIRES, the High Resolution
Echelle Spectrometer for the Keck 10-m telescope. This instrument, described
by Vogt et al. (1994), has a beam diameter of 305 mm and a 1 x 3 mosaic
of 305 x 405 mm R-2.8 echelles with an effective W ≈ 1220 mm. Given
tan δ = 2.8 and θ = 5° we find cos α = 0.253 and W_1 = 310 mm, a bit larger
than the beam diameter. In this case the echelle as seen from the collimator is
approximately square and the only vignetting is at the gaps in the mosaic.

All of the relevant numbers needed to compute ℛφ for HIRES according to
Eq. (13.4.12) are available in the preceding paragraph. It is left as an exercise to
the reader to show that ℛφ = 0.225 or, in units of arc-sec, ℛφ = 46,000 arc-sec.
We use this result in an example when discussing design considerations for
grating and echelle spectrometers in Chapter 15.


13.4.c. SURFACE HOLOGRAPHIC GRATINGS 


In the preceding sections we have discussed efficiency as it applies to 
classically ruled gratings with triangular groove shape. Another type of grating 
available is the surface holographic grating, produced by recording interference 
fringes from two expanded laser beams in a photosensitive material. This type of 
grating has a different groove shape than the ruled type, with efficiency a
sensitive function of groove shape and number of grooves/mm. As pointed out by
Loewen and Popov (1997), a disadvantage of the surface holographic grating is 
that its groove shape cannot be easily controlled. 

Although these gratings are competitive with ruled gratings when used in first 
order, they have not been made at the groove densities characteristic of echelles. 
First-order gratings are typically used at spectral resolutions of 1E3 to 1E4, and in 
this range either type of grating can be used. For high resolution in the range 3E4 
to 1E5, the echelle is clearly superior for broad spectral coverage. Further
information on holographic gratings is found in the references at the end of 
this chapter. 


13.4.d. VOLUME-PHASE HOLOGRAPHIC GRATINGS


A type of grating that shows promise for low and moderate resolution is the 
volume-phase (VP) holographic grating. In this type of grating the periodic 
diffracting structure arises from modulation of the index of refraction within the 
depth of the grating material. Light scattered from layers within a VP grating 
interferes constructively when the Bragg diffraction condition is satisfied. Details 
about the Bragg condition for different grating configurations are given in a paper 
by Barden et al. (1998). 

One particular characteristic of a VP grating that makes it an attractive 
alternative to a ruled grating is its predicted efficiency approaching 100%, in 
both planes of polarization, for layer densities of 600 per mm or higher. 
Measurements on unpolarized light for a 600 lines/mm VP transmission grating
give an absolute peak efficiency of about 80%, including reflective losses, and an
overall efficiency greater than 55% from 450 to 900 nm. The blaze peak and blaze
envelope of such a grating can be tuned by rotating the grating. 

The technology of these gratings is in the development stage, but the work 
done to date indicates that VP gratings are likely to become viable options in the 
design of low-order grating spectrometers. This is especially true if VP gratings 
can be made in the sizes 300—400 mm needed for large telescopes. For a thorough 
review of the potential of VP gratings, see the paper by Barden et al. (1998). 


13.5. FABRY-PEROT INTERFEROMETER 


The Fabry—Perot spectrometer is an important instrument for astronomical 
observations that require very high spectral resolution of limited spectral ranges 
and/or large angular fields at moderate resolution. As we show, the Fabry-Perot is 
superior to a grating or echelle instrument in these cases. Although the
Fabry-Perot was first used in astronomy as a scanning device with a single element
detector, it is now often used in the imaging mode. Both the scanning and
imaging Fabry-Perot spectrometer modes are discussed in this section.

Our discussion of the Fabry—Perot is relatively brief, with results given without 
derivation. The basic theory of the Fabry-Perot can be found in any optics text, 
and extensive discussions of its application in astronomy are given by Roesler 
(1974) and by Meaburn (1976). Although relations for the Fabry—Perot are often 
given in terms of wavenumber, the reciprocal of the wavelength, we choose to 
give all results in terms of wavelength. 


13.5.a. BASIC RELATIONS


A schematic diagram of a Fabry-Perot spectrometer is shown in Fig. 13.15.
For a Fabry-Perot with material of index n between interferometer plates of
separation d, the normalized intensity of the transmitted light at angle α is given
by

    i_t = \left(\frac{T}{1 - R}\right)^2 \left[1 + \frac{4R}{(1 - R)^2}\sin^2\left(\frac{\delta}{2}\right)\right]^{-1},    (13.5.1)

where T and R are the fractions of the incident energy transmitted and reflected at
each surface and δ, the phase difference between successive transmitted beams, is

    \delta = \frac{2\pi}{\lambda}\,2nd\cos\alpha.    (13.5.2)

The reciprocal of the quantity in square brackets in Eq. (13.5.1) is the Airy
function. This function is a maximum when δ = 2mπ, hence wavelengths are
transmitted with maximum intensity when

    m\lambda = 2nd\cos\alpha,    (13.5.3)

where m is an integer order number. Because the Fabry-Perot is axially
symmetric, the pattern in the focal plane of a camera for a broad monochromatic
source is a set of concentric rings with each ring a separate order.

Fig. 13.15. Schematic of Fabry-Perot spectrometer. P, prefilter-collimator combination; D,
detector.

The normalized intensity in Eq. (13.5.1) is one if T +R = 1 and m is an 
integer. More generally we have A + T + R = 1, where A is the fraction of the 
incident energy absorbed at each surface. It is left to the reader to show that the 
ratio i_t(max)/i_t(min) is independent of A.

The free spectral range Δλ is found by the same procedure used in Section
13.2.e, with a similar result,

    \Delta\lambda = \lambda/m = \lambda^2/(2nd\cos\alpha) \approx \lambda^2/(2nd).    (13.5.4)

Because m is usually large, a filter or predisperser is required to eliminate all
unwanted orders. Given that α is generally a small angle, we assume cos α = 1 in
the relations to follow, unless otherwise noted.

The limit of resolution or spectral purity is determined by the characteristics of
the Airy function, and the result is

    \delta\lambda = \Delta\lambda/N = \lambda^2/(2Nnd),    (13.5.5)

where N is called the finesse. The finesse depends on the plate quality and the
reflectance of the coatings. With perfectly flat and parallel reflecting surfaces the
reflective finesse N_R = π√R/(1 − R). For plates of high quality with multilayer
coatings, a typical effective value of N is in the range 30-50.

From Eq. (13.5.5) the spectral resolving power is

    \mathcal{R} = \frac{\lambda}{\delta\lambda} = \frac{N\lambda}{\Delta\lambda} = \frac{2Nnd}{\lambda}.    (13.5.6)


With the basic relations in hand, we now consider the different modes in which 
the Fabry-Perot is used. 
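
The basic relations are compact enough to collect in a few functions. The
sketch below assumes the standard forms of the transmitted intensity,
i_t = [T/(1 − R)]² / [1 + 4R sin²(δ/2)/(1 − R)²], the reflective finesse
π√R/(1 − R), the free spectral range λ²/(2nd), and the resolving power 2Nnd/λ,
all with cos α = 1. The function names and the numerical values (R = 0.9, a
0.1-mm air gap, λ = 500 nm, N = 30) are illustrative assumptions, not values
from the text.

```python
import math

def transmitted_intensity(delta, reflectance, transmittance=None):
    """Normalized transmitted intensity; T defaults to 1 - R (no absorption)."""
    if transmittance is None:
        transmittance = 1.0 - reflectance
    peak = (transmittance / (1.0 - reflectance)) ** 2
    coeff = 4.0 * reflectance / (1.0 - reflectance) ** 2
    return peak / (1.0 + coeff * math.sin(delta / 2.0) ** 2)

def reflective_finesse(reflectance):
    return math.pi * math.sqrt(reflectance) / (1.0 - reflectance)

def free_spectral_range(wavelength, index, gap):
    return wavelength ** 2 / (2.0 * index * gap)

def resolving_power(wavelength, index, gap, finesse):
    return 2.0 * finesse * index * gap / wavelength

finesse_r = reflective_finesse(0.9)
fsr_nm = free_spectral_range(500.0, 1.0, 1.0e5)       # lengths in nm
power = resolving_power(500.0, 1.0, 1.0e5, 30.0)

# Contrast i_t(max)/i_t(min) with and without absorption A = 0.02.
contrast_ideal = transmitted_intensity(0.0, 0.9) / transmitted_intensity(math.pi, 0.9)
contrast_lossy = (transmitted_intensity(0.0, 0.9, 0.08)
                  / transmitted_intensity(math.pi, 0.9, 0.08))
print(f"N_R = {finesse_r:.1f}, FSR = {fsr_nm:.2f} nm, R = {power:.0f}")
print(f"contrast: {contrast_ideal:.0f} (A = 0), {contrast_lossy:.0f} (A = 0.02)")
```

The two contrast values are identical, since absorption scales the maxima and
minima by the same factor.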


13.5.b. SCANNING FABRY-PEROT 


In one mode of operation a circular aperture isolates the central order and a 
single channel device, such as a photomultiplier, is the detector. Scanning of the 
Fabry-Perot is accomplished by changing either n or d in Eq. (13.5.3), with the 
scan rate set by the desired SNR in the output signal. The index n is changed by 
changing the pressure of the gas between the plates, while d is changed with 
piezoelectric actuators between the plates. A scan across one free spectral range is 
accomplished by changing the optical distance between the plates by λ/2. A good
discussion of the piezoelectric scan method is given by Atherton (1987).

The optical layout of a scanning Fabry-Perot is generally similar to that shown
in Fig. 9.9 for a focal reducer. The components include a field lens at the
telescope focus to reimage the telescope exit pupil at the interferometer, a
collimator preceding the interferometer, and a camera. The relation between
the angle α at which the chief ray enters the collimator and the angle θ on the sky
is, from Eq. (9.4.1),

    \alpha f_c = f\theta,    (13.5.7)

where f_c and f are the focal lengths of the collimator and telescope, respectively.
The angular diameter δβ of an axial hole that accepts all of the light in the full
width at half-maximum (FWHM) of the central order is given by

    \delta\beta = (8/\mathcal{R})^{1/2}.    (13.5.8)

Substituting Eq. (13.5.8) into Eq. (12.2.8) gives the etendue for a circular hole as

    U = \frac{\pi d_1^2}{4}\,\frac{\pi(\delta\beta)^2}{4} = \frac{\pi d_1^2}{4}\,\frac{2\pi}{\mathcal{R}},    (13.5.9)

where the anamorphic magnification is one, hence d_1 = d_2. Therefore the
luminosity-resolution product for the Fabry-Perot is given by

    \mathcal{L}\mathcal{R} = 2\pi\tau \cdot \pi d_1^2/4.    (13.5.10)

The transmittance τ is the product of the transmittance τ_0 of the optical elements
and the average of the Airy function over the FWHM of the central order, where
the latter factor is about 0.8.


13.5.c. IMAGING FABRY-PEROT 


An imaging Fabry-Perot differs from the scanning Fabry-Perot primarily at the 
detector end; a high efficiency area detector such as a CCD replaces the single 
channel axial detector. In effect, a single channel device has been converted into a 
multichannel one. The two systems are similar optically, with each having a field 
lens, collimator, and camera, in addition to the required filters to isolate the 
desired orders. Because of the detector the mode of operation is somewhat 
different. For the imaging Fabry—Perot the wavelength range of interest is 
typically scanned by making discrete changes in d, with the time spent at each 
setting determined by the desired SNR in the output signal. 

The outputs of the 2D detector give what is called a data cube. This cube 
consists of two dimensions of spatial information and one of spectral information. 
It is important to note that the transmitted wavelengths for a single slice of this
cube follow Eq. (13.5.3), which we write as


$$\lambda = \lambda_0 \cos\alpha, \tag{13.5.11}$$


where $\lambda_0$ is the wavelength transmitted along the optical axis. Thus a surface of
constant wavelength in the data cube is a curved surface intersecting many slices
of the cube.
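
To give a feel for the size of this effect, the following minimal sketch evaluates the relation $\lambda = \lambda_0\cos\alpha$ at a few off-axis angles. The on-axis wavelength and the angles are illustrative values assumed here, not taken from the text, and the function name is hypothetical.

```python
import math

def transmitted_wavelength(lambda0_nm, alpha_deg):
    """Wavelength passed at off-axis angle alpha: lambda = lambda0 * cos(alpha)."""
    return lambda0_nm * math.cos(math.radians(alpha_deg))

# Assumed, illustrative values: on-axis wavelength in nm, angles at the etalon in degrees.
lambda0_nm = 656.3
for alpha_deg in (0.0, 0.5, 1.0, 2.0):
    lam = transmitted_wavelength(lambda0_nm, alpha_deg)
    print(f"alpha = {alpha_deg:3.1f} deg   lambda = {lam:9.4f} nm   "
          f"shift = {lam - lambda0_nm:+.4f} nm")
```

Because the shift grows roughly as $\alpha^2/2$, a monochromatic source appears in a ring-shaped zone of each slice, which is why a surface of constant wavelength cuts across many slices.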

As an example of an imaging Fabry-Perot we take an f/8, 4-m telescope, and
a Fabry-Perot system with 60-mm diameter plates, f/8 collimator, f/2 camera,
and a CCD with 20-μm pixels. The reader can verify that the scale at the
telescope is approximately 160 μm per arc-sec, hence two detector pixels span
one arc-sec on the sky. Assuming the CCD is 1000 pixels on a side, the field on
the sky is approximately 8 arc-min on a side. If the Fabry-Perot is configured to give a
resolving power of 1E4 on an extended source covering this area, the corresponding
velocity resolution on the source is $c/\mathcal{R}$ or 30 km/sec.
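
The numbers in this example can be checked with a few lines of arithmetic. The sketch below assumes the collimator and camera share the same beam diameter, so the reimaged scale is the telescope scale multiplied by the ratio of camera to collimator focal ratios. The variable names are introduced here for illustration.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)
C_KM_S = 299792.458

telescope_diameter_m = 4.0
telescope_focal_ratio = 8.0
collimator_focal_ratio = 8.0
camera_focal_ratio = 2.0
pixel_um = 20.0
pixels_per_side = 1000
resolving_power = 1.0e4

# Plate scale at the telescope focus, in micrometers per arc-sec.
focal_length_um = telescope_focal_ratio * telescope_diameter_m * 1.0e6
scale_telescope = focal_length_um * ARCSEC_IN_RAD

# Assumption: same beam diameter in collimator and camera, so the scale
# at the detector is reduced by the ratio of the focal ratios.
scale_detector = scale_telescope * camera_focal_ratio / collimator_focal_ratio

pixels_per_arcsec = scale_detector / pixel_um
field_arcmin = pixels_per_side / pixels_per_arcsec / 60.0
velocity_resolution_km_s = C_KM_S / resolving_power

print(f"telescope scale : {scale_telescope:.1f} um/arc-sec")
print(f"detector scale  : {pixels_per_arcsec:.2f} pixels/arc-sec")
print(f"field of view   : {field_arcmin:.1f} arc-min on a side")
print(f"velocity res.   : {velocity_resolution_km_s:.1f} km/sec")
```

The rounded values quoted in the text (160 μm per arc-sec, two pixels per arc-sec, 8 arc-min, 30 km/sec) are consistent with these results.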

For further discussions of imaging Fabry-Perot systems the reader should 
consult the papers by Bland et al. (1990) and Reynolds et al. (1990). 


13.5.d. COMPARISON OF ECHELLE AND FABRY-PEROT 


We now compare $\mathcal{L}\mathcal{R}$ for the Fabry-Perot with that for a typical echelle
spectrometer. In Section 12.2 we find $d_1\,\delta\alpha' = D\phi'$ for any slit spectrometer and,
after substitution into Eq. (13.4.11) and division into Eq. (13.5.10), we get


$$\frac{\mathcal{L}\mathcal{R}\,(\mathrm{FP})}{\mathcal{L}\mathcal{R}\,(\mathrm{E})} = \frac{\tau_{FP}}{\tau_E}\left(\frac{d_{FP}}{d_E}\right)^2 \frac{\pi}{\delta\alpha'\tan\delta}, \tag{13.5.12}$$


with the echelle in the Littrow configuration. Note that $\delta\alpha'$, the angle subtended
by the slit length at the collimator, is $D/d_1$ times larger than the angular length $\phi'$
projected on the sky.

For a well-defined projected slit, assume the largest practical $\delta\alpha'$ is about 1°.
Assuming $\tan\delta = 2$, the ratio in Eq. (13.5.12) is about 100 for the same
transmittances and beam diameters in each instrument. If we assume the largest
echelle beam is two times that of the largest Fabry-Perot beam, the $\mathcal{L}\mathcal{R}$ ratio is
reduced to 25, still substantially larger for the Fabry-Perot. The value of $\delta\alpha'$
assumed here is obviously appropriate for a large extended source. If $D$ = 4 m
and $d_1$ = 200 mm, the corresponding angle on the sky is 180 arc-sec. For smaller
values of $\delta\alpha'$, the $\mathcal{L}\mathcal{R}$ ratio is even more in favor of the Fabry-Perot.
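
The order-of-magnitude figures in this comparison can be reproduced with the sketch below. It assumes the ratio takes the form $(\tau_{FP}/\tau_E)(d_{FP}/d_E)^2\,\pi/(\delta\alpha'\tan\delta)$, which is the form obtained if the Fabry-Perot product is $\pi^2\tau d^2/2$ and the Littrow echelle product is $(\pi/2)\tau d^2\,\delta\alpha'\tan\delta$. That closed form and the function names are assumptions made here for illustration.

```python
import math

def lr_ratio_fp_to_echelle(slit_angle_deg, tan_blaze,
                           beam_ratio=1.0, transmittance_ratio=1.0):
    """Assumed form: (tau_FP/tau_E) (d_FP/d_E)^2 pi / (dalpha' tan(delta))."""
    slit_angle_rad = math.radians(slit_angle_deg)
    return (transmittance_ratio * beam_ratio ** 2 * math.pi
            / (slit_angle_rad * tan_blaze))

def slit_length_on_sky_arcsec(slit_angle_deg, beam_mm, telescope_mm):
    """Angle at the collimator is D/d1 times the angle on the sky."""
    return slit_angle_deg * 3600.0 * beam_mm / telescope_mm

equal_beams = lr_ratio_fp_to_echelle(1.0, 2.0)
echelle_beam_twice_fp = lr_ratio_fp_to_echelle(1.0, 2.0, beam_ratio=0.5)

print(f"equal beams            : ratio = {equal_beams:.1f}")
print(f"echelle beam twice FP  : ratio = {echelle_beam_twice_fp:.1f}")
print(f"1 deg at a 200-mm beam : "
      f"{slit_length_on_sky_arcsec(1.0, 200.0, 4000.0):.0f} arc-sec on the sky")
```

Under these assumptions the ratios come out near 90 and 22, in line with the rounded values of 100 and 25 quoted in the text.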

We now compare a scanning Fabry-Perot and echelle for a stellar source, with
each spectrometer on a 4-m telescope. We assume beam diameters are 100 and
200 mm, respectively, for the Fabry-Perot and echelle, and the stellar seeing disk
is one arc-sec. For an entrance slit width equal to the seeing disk diameter, the
resolving power of the echelle is 4E4 in the Littrow configuration and about 20%
larger for $\theta$ = 5°. This resolving power is far below the limit of 1.6E6 and the
echelle could be used to get higher resolving power, say ten times higher, but only
at the expense of luminosity.

For the Fabry-Perot we assume $\mathcal{R}$ = 4E5, a resolving power appropriate for
the observation of narrow spectral lines, for example. With this $\mathcal{R}$ we find
$\delta\beta$ = 15 arc-min, which projects to about 23 arc-sec on the sky. Light loss at the
entrance aperture is not a problem with this telescope-spectrometer combination
and the system is far from any reasonable seeing limit. It is important to note that
for $\lambda$ = 500 nm and N = 40 the free spectral range of the Fabry-Perot for this
resolving power is only 0.05 nm, compared to 5 or 10 nm for an echelle in this
same wavelength range.
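
The stellar-source figures can be checked numerically. The sketch below rests on three assumptions that are not stated explicitly in this passage: the Littrow echelle resolving power is $2 d_1\tan\delta/(\phi D)$ with $\tan\delta = 2$, the Fabry-Perot hole diameter is $(8/\mathcal{R})^{1/2}$, and N is the finesse, so that the free spectral range is $N\lambda/\mathcal{R}$.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)
telescope_mm = 4000.0

# Echelle (assumed Littrow relation, tan(delta) = 2, slit width = 1 arc-sec).
echelle_beam_mm = 200.0
tan_blaze = 2.0
slit_width_rad = 1.0 * ARCSEC_IN_RAD
echelle_resolving_power = (2.0 * echelle_beam_mm * tan_blaze
                           / (slit_width_rad * telescope_mm))

# Fabry-Perot: angular diameter of the axial hole, then its projection on the sky.
fp_beam_mm = 100.0
fp_resolving_power = 4.0e5
hole_diameter_rad = math.sqrt(8.0 / fp_resolving_power)
hole_diameter_arcmin = math.degrees(hole_diameter_rad) * 60.0
hole_on_sky_arcsec = hole_diameter_arcmin * 60.0 * fp_beam_mm / telescope_mm

# Free spectral range, taking N as the finesse (assumption).
wavelength_nm = 500.0
finesse = 40.0
free_spectral_range_nm = finesse * wavelength_nm / fp_resolving_power

print(f"echelle resolving power : {echelle_resolving_power:.3g}")
print(f"hole diameter           : {hole_diameter_arcmin:.1f} arc-min")
print(f"hole on the sky         : {hole_on_sky_arcsec:.1f} arc-sec")
print(f"free spectral range     : {free_spectral_range_nm:.3f} nm")
```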

These examples illustrate the kinds of observations for which the Fabry-Perot 
is especially well-suited, very high spectral resolving power on sources of small 
angular size or moderate to high resolving power for sources of large angular size. 

Although the Fabry-Perot clearly has an advantage over an echelle system in 
terms of the $\mathcal{L}\mathcal{R}$ product, there are other considerations as well. One feature
already noted here is the small free spectral range of the Fabry-Perot compared to 
the echelle. This, together with the area coverage of either a scanning or imaging 
system, makes the Fabry-Perot especially suited to studies of individual spectral 
lines of extended, emission line sources such as gaseous nebulae. Echelle 
spectrometers, on the other hand, are most often used in a mode in which the 
flux in all spectral elements over a wide wavelength range is recorded simultaneously.
Thus the echelle is well suited to the study of stars and near stellar
sources. 

In summary, Fabry-Perot and echelle spectrometers are best viewed as 
complementary high-resolution spectral analyzers of celestial sources. 


13.6. FOURIER TRANSFORM SPECTROMETER 


Although a Fourier transform spectrometer (FTS) is not a dispersing system in
the sense defined in Section 12.1, it gives an output from which the spectrum can 
be derived. Because the FTS is used to get spectral data, especially in the infrared, 
a brief discussion of its characteristics is in order. Results are given without 
derivation; for discussions of the theory of the FTS the reader should consult the 
references at the end of the chapter. 

An FTS is basically a scanning Michelson interferometer with collimated light 
as the input, as shown schematically in Fig. 13.16. The input beam is divided by a 
beamsplitter with approximately one-half going to each of the mirrors A and B. 
The light reflected from the mirrors is again divided by the beamsplitter, with 
approximately one-half of the original beam recombined and sent to the single
channel detector D.

Fig. 13.16. Schematic of Fourier transform spectrometer. A, fixed mirror; B, movable mirror; D,
detector.


13.6.a. BASIC RELATIONS 


For a collimated monochromatic beam the on-axis intensity at D is determined
by the path difference between the two arms of the interferometer. Let $x_a$ and $x_b$
denote the respective distances from the center of the beamsplitter to the center of
each mirror. The OPD between the recombined axial beams, assuming $n = 1$, is
$2(x_b - x_a) = 2\,\Delta x$, and the phase difference is $2k\,\Delta x$, where $k = 2\pi/\lambda$. Note that
Eqs. (13.5.2) and (13.5.3) for the Fabry-Perot apply to off-axis beams in the
Michelson if $d$ is replaced by $\Delta x$.

The fraction $T(k, \Delta x)$ of the incident beam in the output beam is given by


$$T(k, \Delta x) = \tfrac{1}{2}\,[1 + \cos(2k\,\Delta x)]. \tag{13.6.1}$$


If the beams from the two arms are in phase [$\cos(2k\,\Delta x) = 1$] we have $T = 1$; if
the beams are $\pi$ out of phase [$\cos(2k\,\Delta x) = -1$] then $T = 0$. The relation in Eq.
(13.6.1) is a direct consequence of two-beam interference.

Given an incident beam whose spectrum is $I(k)$, the flux $F$ in the output beam
is


$$\begin{aligned}
F(\Delta x) &= C \int I(k)\,T(k, \Delta x)\,dk \\
&= \text{constant} + \frac{C}{2} \int I(k)\cos(2k\,\Delta x)\,dk,
\end{aligned} \tag{13.6.2}$$


where $C$ is a constant. The output $F(\Delta x)$ for all $\Delta x$ from a minimum value,
usually zero, to the maximum value is called the interferogram. The integral in
the second line of Eq. (13.6.2) is the Fourier cosine transform of the spectrum.
From the theory of Fourier transforms, the transform of the recorded flux $F(\Delta x)$
is the spectrum.
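
The relation between interferogram and spectrum can be illustrated with a small numerical experiment. The sketch below builds the interferogram of a hypothetical two-line spectrum from the two-beam transmission $T = \tfrac{1}{2}[1 + \cos(2k\,\Delta x)]$ and then recovers the lines with a discrete cosine sum. The line wavelengths, scan length, and sampling step are assumed values, and the simple unapodized sum stands in for the more careful transforms used in practice.

```python
import math

def transmission(k, dx):
    """Two-beam transmission; k = 2*pi/lambda, dx = mirror offset."""
    return 0.5 * (1.0 + math.cos(2.0 * k * dx))

def interferogram(lines, offsets):
    """Flux versus mirror offset for a spectrum of discrete (k, intensity) lines."""
    return [sum(i * transmission(k, dx) for k, i in lines) for dx in offsets]

def cosine_transform(flux, offsets, k):
    """Estimate the spectrum at wavenumber k from a sampled interferogram."""
    mean = sum(flux) / len(flux)           # removes the constant term
    step = offsets[1] - offsets[0]
    return step * sum((f - mean) * math.cos(2.0 * k * dx)
                      for f, dx in zip(flux, offsets))

# Hypothetical spectrum: unit-intensity lines at 1.00 and 1.10 micrometers.
k1 = 2.0 * math.pi / 1.00
k2 = 2.0 * math.pi / 1.10
k_between = 2.0 * math.pi / 1.05
offsets = [0.05 * j for j in range(2001)]  # mirror offsets, 0 to 100 micrometers
flux = interferogram([(k1, 1.0), (k2, 1.0)], offsets)

peak_1 = cosine_transform(flux, offsets, k1)
peak_2 = cosine_transform(flux, offsets, k2)
gap = cosine_transform(flux, offsets, k_between)
print(f"line 1: {peak_1:.2f}   line 2: {peak_2:.2f}   between lines: {gap:.2f}")
```

The transform is large at the two line positions and small between them, which is the sense in which the transform of the interferogram returns the spectrum.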

The spectral resolving power $\mathcal{R}$ achievable with the FTS is directly proportional
to the maximum $\Delta x$ in the scan producing the interferogram, and is given
by


$$\mathcal{R} = \frac{4\,\Delta x(\max)}{\lambda}. \tag{13.6.3}$$


If, for example, $\Delta x$(max) = 10 cm, then $\mathcal{R}$ = 4E5 at $\lambda$ = 1000 nm. As in the
case of a diffraction grating or Fabry-Perot, $\mathcal{R}$ is set by the number of wavelengths
in the maximum path difference.

Because Eq. (13.5.3) applies to both the Michelson and Fabry-Perot interferometers,
the relations for $\delta\beta$, $U$, and $\mathcal{L}\mathcal{R}$ in Eqs. (13.5.8)-(13.5.10) also apply
to an FTS. As a consequence, the resolution achieved in practice is smaller than
given in Eq. (13.6.3) by about a factor of two, when the angular diameter of the
exit aperture is set according to Eq. (13.5.8).
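
As a quick numerical check of the scan length needed for a given resolving power, the sketch below evaluates $\mathcal{R} = 4\,\Delta x(\max)/\lambda$ and applies the factor-of-two reduction described above. The function name is introduced here, and the factor of two is the approximate value quoted in the text.

```python
def fts_resolving_power(max_offset_m, wavelength_m):
    """Ideal resolving power R = 4 * dx(max) / lambda."""
    return 4.0 * max_offset_m / wavelength_m

ideal = fts_resolving_power(0.10, 1000e-9)   # 10-cm scan at 1000 nm
practical = ideal / 2.0                      # approximate loss from the finite exit aperture

print(f"ideal R = {ideal:.3g}, practical R is about {practical:.3g}")
```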


13.6.b. COMPARISONS AND COMMENTS 


Given the similarities between an FTS and a Fabry-Perot spectrometer and the 
discussion in the preceding section, it is evident that the FTS is also well-suited to 
observations requiring high spectral resolving power. An advantage of an FTS is 
that, unlike a scanning Fabry-Perot, all of the light in the passband of interest is 
being recorded all of the time in an FTS. Bland et al. (1990) have pointed out that
this advantage is less significant for detectors with lower read noise. Based on Eq.
(13.5.12) and the following discussion, it is also clear that an FTS has a
significant $\mathcal{L}\mathcal{R}$ advantage over an echelle. Thus the etendue advantage of an
FTS over an echelle is maintained at high resolution.

A disadvantage of an FTS is the extreme care that must be taken to produce a 
uniform scan. The value of $\Delta x$ in Eq. (13.6.1) must be known to a small fraction
of a wavelength to ensure that the transform of the flux in Eq. (13.6.2) gives a 
meaningful result. Stringent mechanical requirements on the scan mechanism and 
rigidity of the interferometer base are alleviated somewhat by using cube corners 
in place of the mirrors shown in Fig. 13.16. 

It is also possible to construct an imaging FTS by placing an area detector in 
the focal plane of a camera lens. An example of an imaging FTS operating at 
moderate resolution in the near infrared is described by Simons et al. (1994). 
Another variant of the FTS is a nonscanning system in which the mirrors A and B
in Fig. 13.16 are replaced by diffraction gratings, with each tilted with respect to
the axial ray incident upon it. This type of interferometer produces a Fizeau 
fringe pattern at the detector. For a discussion of its features the reader should 
consult the paper by Harlander and Roesler (1990). 


13.7. CONCLUDING REMARKS 


The dispersive devices discussed in this chapter range from the simple prism 
to elegant interferometric devices such as the Fabry-Perot and Michelson spectrometers.
Most of the discussion, however, is given to diffraction gratings and their
dispersive characteristics. This emphasis on gratings is simply a reflection of the
fact that most of the spectral data on astronomical sources has been obtained with
grating instruments. The versatility of grating spectrometers, especially their
adaptability to observing multiple sources simultaneously, makes them the choice
for a wide range of observing programs.

A quick look at the tables of contents of proceedings from recent conferences 
on astronomical instrumentation suggests that diffraction grating spectrometers 
will continue as the most common type of spectrographic instrumentation. 
Following an analysis of grating aberrations and a brief discussion of concave 
gratings in Chapter 14, we complete our discussion of spectrographic instruments
with a thorough look at plane grating spectrometers in Chapter 15.


Chapter 14 Grating Aberrations; Concave Grating Spectrometers


Most astronomical spectrometers use a grating or echelle as the primary 
dispersing element. As noted in Chapter 13, this choice is made for reasons of 
flexibility in choice of spectrum format and resolution to match modern detectors, 
and the ability to get broad spectral coverage at good efficiency. In this chapter we 
discuss the limitations on grating performance from the point of view of 
geometrical optics, both for the grating itself and for selected concave grating 
instruments. 

The limitations of a grating spectrometer are determined by two principal 
factors—the aberrations due to the collimator and camera optics, and the 
aberrations introduced by the grating. Grating aberrations are determined by 
the type of grating surface, whether it is plane or spherical, and the focal ratio of 
the incident beam. In most spectrometers the incident light is collimated, but there 
are systems in which the incident light is a convergent beam. 

In this chapter we use Fermat’s Principle to derive the aberrations of a grating 
surface, either reflecting or refracting. By combining these results with the 
aberrations of the other optics in a spectrometer, we can determine overall 
spectrometer performance. This system analysis is given for concave grating 
spectrometers in this chapter and plane grating instruments in the following 
chapter. 

The approach followed in deriving the grating aberrations is similar to those 
used by Beutler (1945), Namioka (1959), and Welford (1965). Each uses a
different notation and the reader should note these differences when comparing
results given by different authors. Our approach parallels that given in Chapter 5 
for a general refracting surface, with those results modified to include the grating 
characteristics. 


14.1. APPLICATION OF FERMAT’S PRINCIPLE TO GRATING SURFACE


Although a plane reflection grating is the element of choice in most grating 
spectrometers, we choose to set up a more general case in order to derive the 
general grating equation. After showing that this equation for a transmission 
grating is similar to that for a reflection grating, we carry out subsequent 
derivations of aberrations for the reflection case only. During this discussion it 
will become evident why plane gratings are preferred over spherical gratings for 
most applications. 

A sketch of a grating surface is shown in Fig. 14.1, with the origin of the
coordinate system at the vertex of the surface. The grating grooves are taken
parallel to the y-axis, with the separation between adjacent grooves, measured
perpendicular to the yz-plane, given by $\sigma$. The medium in front of the surface has
index $n$; the medium on the other side has index $n'$. The object and image points
are at Q and Q’, respectively, and an arbitrary ray from Q intersects the grating
surface at B(x, y, z). The equation of the surface is given by Eq. (5.1.1) with K
and b each set equal to zero. With this simplification we restrict our analysis to
plane or spherical grating surfaces.

Fig. 14.1. Coordinate system for grating with center at origin and rulings parallel to y axis. The Q
and Q’ are object and image points, respectively; point B is on the grating surface.

Unlike the situation in Chapter 5, the grating surface is not rotationally
symmetric about the z-axis. Thus we locate Q, as shown in Fig. 14.1, distances
h and w from the xz- and yz-planes, respectively. The chief ray from Q to the
origin of the coordinate system makes angle $\gamma$ with the xz-plane, and its projection
on the xz-plane makes angle $\alpha$ with the z-axis. The image point Q’ and the
refracted chief ray are defined in a similar way with primed quantities, except that
$\beta$ is used as the counterpart of $\alpha$.

It is important to note here that most spectrometers used on telescopes have
$\gamma = 0$ at the center of the entrance slit. This type of spectrometer is the so-called
in-plane design in which the incident and diffracted chief ray are in the xz-plane.
For an in-plane spectrometer with a long slit there is a range of $\gamma$ and, as we show
here, the result is spectral line curvature. If $\gamma$ is not zero at the slit center, then the
spectrometer is an off-plane design. In this case the spectral lines from a long slit
source are both tilted and curved. The origin of these effects is discussed in what
follows.

Proceeding now with Fermat’s Principle applied to the arbitrary ray shown in 
Fig. 14.1, we write the optical path length OPL between Q and Q’ as 


$$\mathrm{OPL} = n[QB] + n'[BQ'] \pm (m\lambda/\sigma)x, \tag{14.1.1}$$


where Fermat’s Principle is satisfied when $\delta(\mathrm{OPL}) = 0$ for any change in B on
the surface. The wavelength $\lambda$ in Eq. (14.1.1) is the vacuum wavelength. To
maintain consistency with the sign convention on angles, the plus sign on the
right-hand term is applied to transmission gratings, and the minus sign to
reflection gratings.

The first two terms in Eq. (14.1.1) are the same as those in Eq. (5.1.2); the
additional term is what makes the surface a grating. This is most easily seen by
assuming x changes by $\sigma$, a step from one groove to an adjacent one. Because of
this change in x there is an accompanying $\delta(\mathrm{OPL}) = m\lambda$, and a corresponding
phase difference of $2\pi m$ between the two points. If $m$ is an integer, diffracted rays
from these points are in phase and the effective change in OPL is zero. Thus these
two rays interfere constructively at the image. Note that this conclusion is based
on the assumption that the sum of the first two terms is the same for both rays, as
required by Fermat’s Principle for a nongrating surface.

The procedure from this point on is similar to that carried out in Chapter 5.
The line segments in Eq. (14.1.1) are written in terms of the parameters in Fig.
14.1 and each, in turn, is expanded in powers of x and y. From the geometry in
Fig. 14.1 we find


$$\begin{aligned}
[QB] &= [(x - w)^2 + (y - h)^2 + (z - z_0)^2]^{1/2}, \\
[BQ'] &= [(x - w')^2 + (y - h')^2 + (z - z_0')^2]^{1/2},
\end{aligned} \tag{14.1.2}$$


where


$$\begin{aligned}
w &= s\sin\alpha, & h &= s\tan\gamma, & z_0 &= s\cos\alpha, \\
w' &= s'\sin\beta, & h' &= s'\tan\gamma', & z_0' &= s'\cos\beta.
\end{aligned} \tag{14.1.3}$$


The distances $s$ and $s'$ in Eqs. (14.1.3) are measured along the projections of [QB]
and [BQ’] on the xz-plane, respectively. The usual sign conventions apply to all
parameters, with all angles in Fig. 14.1 positive, $s$, $w$, $h$, and $z_0$ negative, and the
primed distances positive.


14.1.a. GENERAL GRATING EQUATION


To derive the grating equation we need only the linear terms in the expansion
of Eq. (14.1.1). Substituting Eqs. (14.1.3) into Eqs. (14.1.2), we find that the OPL
can be written as


$$\begin{aligned}
\mathrm{OPL} = {} & n's' - ns + y(n\sin\gamma - n'\sin\gamma') \\
& + x[n\cos\gamma\sin\alpha - n'\cos\gamma'\sin\beta \pm (m\lambda/\sigma)] \\
& + \text{terms in higher powers of } x \text{ and } y.
\end{aligned} \tag{14.1.4}$$


Terms in higher power lead to aberrations and their forms are given in a
subsequent section.

Taking the partial derivatives of OPL with respect to y and x and setting each
equal to zero gives


$$n\sin\gamma = n'\sin\gamma', \tag{14.1.5}$$

$$m\lambda = \pm\sigma(n'\cos\gamma'\sin\beta - n\cos\gamma\sin\alpha). \tag{14.1.6}$$


These equations are the grating counterpart of Snell’s law for a general surface of
revolution. Equation (14.1.5) applies in the yz-plane and is simply Snell’s law,
while Eq. (14.1.6) is the grating equation.


For a reflection grating $n' = -n$, hence $\gamma' = -\gamma$ and, choosing the minus sign
in Eq. (14.1.6), we get


$$m\lambda = n\sigma\cos\gamma(\sin\beta + \sin\alpha), \tag{14.1.7}$$


where $n > 0$ for light incident in the +z direction. Note that $\beta$ is a function of $\gamma$
for constant $\alpha$ and $\lambda$, hence the image of a long straight slit parallel to the y-axis is
curved. The details of this curvature are discussed in the next section.

For a transmission grating surface we choose $\gamma = 0$ and the plus sign in Eq.
(14.1.6), and get $m\lambda = \sigma(n'\sin\beta - n\sin\alpha)$. Applying this relation at the grating
surface and Snell’s law at the other gives, assuming plane parallel surfaces,


$$m\lambda = n\sigma(\sin\beta - \sin\alpha), \tag{14.1.8}$$


where $\alpha$ is the angle of incidence at the first surface, $\beta$ is the angle of diffraction
following the second surface, and $n$ is the index of the medium, usually air, in
which the grating is located. The index $n$ is positive for light directed in the +z
direction. Note that the index of the blank is absent from Eq. (14.1.8). Thus the
transmission grating is, in effect, a diffracting element of negligible thickness.

There is also an element called a grism in which a transmission grating is put
on one surface of a prism. The relation in Eq. (14.1.8) applies to the grating if the
combination is treated as a prism in contact with a grating of negligible thickness.
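
The reflection-grating relation $m\lambda = n\sigma\cos\gamma(\sin\beta + \sin\alpha)$ is easily solved for the diffraction angle. The sketch below does so for arbitrary $\alpha$ and $\gamma$; the function name and the example grating (600 grooves/mm, first order, 500 nm) are illustrative choices made here.

```python
import math

def diffraction_angle_deg(order, wavelength_nm, grooves_per_mm,
                          alpha_deg, gamma_deg=0.0, index=1.0):
    """Solve m*lambda = n*sigma*cos(gamma)*(sin(beta) + sin(alpha)) for beta.

    Returns None when |sin(beta)| > 1, i.e. the order does not propagate.
    """
    sigma_nm = 1.0e6 / grooves_per_mm
    sin_beta = (order * wavelength_nm
                / (index * sigma_nm * math.cos(math.radians(gamma_deg)))
                - math.sin(math.radians(alpha_deg)))
    if abs(sin_beta) > 1.0:
        return None
    return math.degrees(math.asin(sin_beta))

beta_in_plane = diffraction_angle_deg(1, 500.0, 600.0, alpha_deg=0.0)
beta_off_plane = diffraction_angle_deg(1, 500.0, 600.0, alpha_deg=0.0, gamma_deg=2.0)
print(f"gamma = 0 deg: beta = {beta_in_plane:.4f} deg")
print(f"gamma = 2 deg: beta = {beta_off_plane:.4f} deg")
```

The small increase of $\beta$ with $\gamma$ at fixed $\alpha$ and $\lambda$ is the origin of the curved slit image.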


14.1.b. SPECTRUM LINE CURVATURE AND TILT 


The curvature of the image of a straight slit is a consequence of the
dependence of $\beta$ on $\gamma$ noted following Eq. (14.1.7). At constant $\alpha$ and $\lambda$ we get


$$\frac{d\beta}{d\gamma} = \frac{\sin\gamma(\sin\alpha + \sin\beta)}{\cos\gamma\cos\beta} = \lambda A\tan\gamma, \tag{14.1.9}$$


where $A$ is the angular dispersion of the grating for $\gamma = 0$. For small $\gamma$ the total
change in $\beta$ between $\gamma = 0$ and some largest $\gamma_0$ is found by integrating Eq.
(14.1.9) between these limits. The result, expressed as $\Delta\beta$, is


$$\Delta\beta = (\gamma_0^2/2)\lambda A. \tag{14.1.10}$$


Note that $\Delta\beta > 0$, hence the change in $\beta$ is toward longer wavelengths. In the
camera focal plane, as shown in Fig. 14.2, the linear displacement from a straight
image is $f_2\,\Delta\beta$, given by


$$f_2\,\Delta\beta = \frac{(f_2\gamma_0)^2}{2\rho}, \tag{14.1.11}$$


where $\rho = f_2/\lambda A$ is the radius of curvature of the image.

Fig. 14.2. Spectrum line curvature in spectrometer focal plane with radius of curvature $\rho$. See Eq.
(14.1.10).

The slope at a point on the curved image, relative to a straight image, is $d\beta/d\gamma$
as given in Eq. (14.1.9). Hence a short entrance slit for which $\gamma$ is not zero at the
center is imaged as a tilted, though nearly straight, line. This assumes, of course,
that the entrance slit is not tilted out of the yz-plane. Line tilt of this type is
present in all off-plane spectrometer designs.

Note that line curvature and tilt are present even if the grating and spectrometer
are otherwise free of aberrations. If the instrument has large astigmatism,
this curvature is superposed on any curvature of the astigmatic image. The reader
can consult the reference by Welford (1965) for details on this combination.
It is also important to note that curvature and tilt are larger for an echelle than 
for a typical grating of small blaze angle, in direct proportion to the angular 
dispersion. This factor is an important one to take into account in the design of 
any spectrometer, but especially so for an echelle instrument. 
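
The small-angle result for line curvature can be compared with the exact grating equation. The sketch below takes a hypothetical in-plane configuration ($\alpha$ = 10°, $\beta$ = 25°, camera focal length 500 mm, all assumed values) and evaluates the approximate shift $(\gamma_0^2/2)\lambda A$, with $\lambda A = (\sin\alpha + \sin\beta)/\cos\beta$, against the shift obtained by solving $m\lambda = \sigma\cos\gamma(\sin\beta + \sin\alpha)$ directly with $n = 1$.

```python
import math

def beta_exact(alpha, gamma, m_lambda_over_sigma):
    """beta from m*lambda = sigma*cos(gamma)*(sin(beta) + sin(alpha)); radians."""
    return math.asin(m_lambda_over_sigma / math.cos(gamma) - math.sin(alpha))

# Hypothetical configuration (assumed values).
alpha = math.radians(10.0)
beta0 = math.radians(25.0)
camera_focal_length_mm = 500.0
gamma0 = 0.01                      # largest out-of-plane angle, radians

m_lambda_over_sigma = math.sin(alpha) + math.sin(beta0)
lambda_A = (math.sin(alpha) + math.sin(beta0)) / math.cos(beta0)

delta_beta_approx = 0.5 * gamma0 ** 2 * lambda_A
delta_beta_exact = beta_exact(alpha, gamma0, m_lambda_over_sigma) - beta0
radius_mm = camera_focal_length_mm / lambda_A
sag_mm = camera_focal_length_mm * delta_beta_approx

print(f"delta beta (approx) = {delta_beta_approx:.4e} rad")
print(f"delta beta (exact)  = {delta_beta_exact:.4e} rad")
print(f"radius of curvature = {radius_mm:.0f} mm, displacement = {sag_mm * 1000:.1f} um")
```

Since $\lambda A$ is several times larger for an echelle at a steep blaze angle, the same slit length gives a proportionally smaller radius of curvature.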


14.2. GRATING ABERRATIONS 


The aberrations of a grating are found from the higher-order terms in Eq. 
(14.1.4). In the general case, these terms contain factors depending on $\sin\gamma$ and
$\sin\gamma'$ to various powers. Most spectrometers, however, are of the in-plane design
and we choose to set $\gamma$ and $\gamma'$ to zero. The factors left out of the aberration
coefficients with this choice are of order $\gamma^2$ and $\gamma'^2$ smaller than those that
remain. 

Another simplification results if we consider only the case of a reflection 
grating. This is sufficient because the aberration coefficients for a plane reflection 
grating also apply to a transmission grating, given the discussion leading to Eq. 
(14.1.8). 

We also give only the squared and cubed terms in the OPL expansion, thus the 
only aberration coefficients we derive are those of astigmatism and coma. 
Spherical aberration, which comes from the fourth-power terms, is negligible 
in most cases. If it is significant for a specific grating type, its value is given in
our discussion for that type.

Following our discussion in Section 5.1, we define $\Phi$ as the optical path
difference between the general and chief rays, and find


$$\begin{aligned}
\Phi = {} & -\frac{n x^2}{2}\left(\frac{\cos^2\beta}{s'} + \frac{\cos^2\alpha}{s} - \frac{\cos\beta + \cos\alpha}{R}\right)
 - \frac{n y^2}{2}\left(\frac{1}{s'} + \frac{1}{s} - \frac{\cos\beta + \cos\alpha}{R}\right) \\
& - \frac{n x^3}{2}\left[\frac{\sin\beta}{s'}\left(\frac{\cos^2\beta}{s'} - \frac{\cos\beta}{R}\right) + \frac{\sin\alpha}{s}\left(\frac{\cos^2\alpha}{s} - \frac{\cos\alpha}{R}\right)\right] \\
& - \frac{n x y^2}{2}\left[\frac{\sin\beta}{s'}\left(\frac{1}{s'} - \frac{\cos\beta}{R}\right) + \frac{\sin\alpha}{s}\left(\frac{1}{s} - \frac{\cos\alpha}{R}\right)\right] \\
= {} & A_1 x^2 + A_1' y^2 + A_2 x^3 + A_2' x y^2.
\end{aligned} \tag{14.2.1}$$


Note that Eq. (14.2.1) does not contain the linear terms from Eq. (14.1.4); these
are zero from the grating equation.

A comparison of terms in Eq. (14.2.1) with corresponding ones in Eq. (5.1.5)
shows that they are the same provided x and y are interchanged, $\theta$ and $\theta'$ are
replaced by $\alpha$ and $\beta$, respectively, and $n'$ is replaced by $-n$. Note that the chief ray
in Fig. 5.1 is in the yz-plane while the chief ray in Fig. 14.1 is in the xz-plane.
Because of this correspondence we can use many of the results derived in Chapter
5. One important difference in the results to follow is that the small angle
approximation is not applied, except in selected cases.

The locations of the astigmatic images are found by setting either $A_1$ or $A_1'$ in
Eq. (14.2.1) to zero. Setting $A_1 = 0$ gives


$$\frac{\cos^2\beta}{s_t'} + \frac{\cos^2\alpha}{s} = \frac{\cos\beta + \cos\alpha}{R}, \tag{14.2.2}$$


where $s_t'$ is the location of the tangential astigmatic image. This is a line image
perpendicular to the xz-plane, hence parallel to the grating grooves, as shown in
Fig. 14.3. The detector must be located at this image, if grating astigmatism is not
to degrade the spectral resolution. For a plane transmission grating the right side
of Eq. (14.2.2) is zero and the plus on the left side is changed to a minus.

Setting $A_1'$ to zero gives $s_s'$, the location of the sagittal astigmatic image. Taking
this expression and Eq. (14.2.2) we find that the separation $\Delta s'$ between the
astigmatic images is


$$-\frac{2A_1'}{n} = \frac{2A_1'}{n'} = \frac{\Delta s'}{s_s' s_t'} = \frac{\sin^2\beta(\cos\beta + \cos\alpha)}{R\cos^2\beta} + \frac{\sin^2\alpha - \sin^2\beta}{s\cos^2\beta}, \tag{14.2.3}$$


where the relation between $\Delta s'$ and $A_1'$ is the grating counterpart of Eq. (5.2.5)
and follows by substituting $s_t'$ for $s'$ in $A_1'$. The analog to Eq. (5.2.6) for the
transverse astigmatism is


$$\mathrm{TAS} = y(\Delta s'/s_s') = 2A_1' y s_t'/n', \tag{14.2.4}$$


where the total length of the tangential image is 2|TAS|. The length of the line
image, in units of the grating groove length, is


$$|\mathrm{TAS}|/y = 2A_1' s_t'/n'. \tag{14.2.5}$$


Fig. 14.3. Tangential (T) and sagittal (S) astigmatic images. Maximum spectral resolution is
achieved with the detector at the T image.


We now return to Eq. (14.2.2) and select several combinations of $s_t'$ and $s$ that
satisfy that equation. Commonly used names are given for each combination, or
mounting as it is usually called, with results given in Table 14.1.


Table 14.1

Grating Mountings

Grating    $s$              $s_t'$                                        Name
Concave    $R\cos\alpha$    $R\cos\beta$                                  Rowland
Concave    $\infty$         $R\cos^2\beta/(\cos\beta + \cos\alpha)$       Wadsworth
Plane      $\infty$         $\infty$
Plane      $s$              $\mp s\,(\cos^2\beta/\cos^2\alpha)$           Monk-Gillieson


The characteristics of each mounting in Table 14.1 are easily described in
terms of the object and image locations. For the Rowland mounting the entrance
slit and image lie on the Rowland circle, a circle of diameter R tangent to the
concave grating at its vertex. The grating in the Wadsworth mounting is
illuminated with collimated light and the curved focal surface is roughly a
distance R/2 from the grating vertex. A convergent or divergent light bundle is
incident on the grating in the Monk-Gillieson mounting. The object and image lie
on opposite sides of a reflection grating, and the minus sign in Table 14.1 applies.
For a transmission grating the plus sign in Table 14.1 applies and object and
image are on the same side.
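
The mountings can be recovered numerically from the tangential focus condition. The sketch below assumes that condition in the form $\cos^2\beta/s_t' + \cos^2\alpha/s = (\cos\beta + \cos\alpha)/R$ for a reflection grating and solves it for $s_t'$. The function name and the sample angles are illustrative, and the signs of $R$ and $s$ are whatever the caller's sign convention supplies.

```python
import math

def tangential_image_distance(alpha_deg, beta_deg, radius, s=math.inf):
    """Solve cos^2(beta)/s_t' + cos^2(alpha)/s = (cos(beta) + cos(alpha))/R for s_t'.

    radius = math.inf describes a plane grating; s = math.inf is collimated light.
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    rhs = (cb + ca) / radius - ca ** 2 / s
    return math.inf if rhs == 0.0 else cb ** 2 / rhs

R = 1000.0
alpha_deg, beta_deg = 20.0, 10.0
ca = math.cos(math.radians(alpha_deg))
cb = math.cos(math.radians(beta_deg))

rowland = tangential_image_distance(alpha_deg, beta_deg, R, s=R * ca)
wadsworth = tangential_image_distance(alpha_deg, beta_deg, R)
plane_collimated = tangential_image_distance(alpha_deg, beta_deg, math.inf)
monk_gillieson = tangential_image_distance(alpha_deg, beta_deg, math.inf, s=-500.0)

print(f"Rowland        : s_t' = {rowland:.2f}  (R cos(beta) = {R * cb:.2f})")
print(f"Wadsworth      : s_t' = {wadsworth:.2f}")
print(f"Plane, s = inf : s_t' = {plane_collimated}")
print(f"Monk-Gillieson : s_t' = {monk_gillieson:.2f}")
```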

We get the transverse astigmatism for each mounting in Table 14.1 by 
substituting R, s, and s, into Eqs. (14.2.3) and (14.2.4). The results are given 
in Table 14.2, where TAS is expressed in units of grating groove length. There is 
no entry for a plane grating in collimated light because its $A_1'$ is zero.
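
As a worked instance of this substitution, take the Rowland values $s = R\cos\alpha$ and $s_t' = R\cos\beta$. The algebra below is a sketch that assumes the astigmatism relation in the form $2A_1'/n' = \sin^2\beta(\cos\beta + \cos\alpha)/(R\cos^2\beta) + (\sin^2\alpha - \sin^2\beta)/(s\cos^2\beta)$, together with $\mathrm{TAS}/y = 2A_1' s_t'/n'$:

```latex
\begin{aligned}
\frac{\mathrm{TAS}}{y} = \frac{2A_1' s_t'}{n'}
 &= R\cos\beta\left[\frac{\sin^2\beta(\cos\beta+\cos\alpha)}{R\cos^2\beta}
   + \frac{\sin^2\alpha-\sin^2\beta}{R\cos\alpha\cos^2\beta}\right] \\
 &= \frac{\sin^2\beta\cos\alpha(\cos\beta+\cos\alpha) + \sin^2\alpha - \sin^2\beta}
         {\cos\alpha\cos\beta} \\
 &= \frac{\sin^2\beta\cos\alpha\cos\beta + \sin^2\alpha\cos^2\beta}{\cos\alpha\cos\beta} \\
 &= \sin^2\beta + \sin^2\alpha\,\frac{\cos\beta}{\cos\alpha}.
\end{aligned}
```

The third line uses $\sin^2\beta\cos^2\alpha + \sin^2\alpha - \sin^2\beta = \sin^2\alpha\cos^2\beta$. Setting $s = \infty$ with the Wadsworth value of $s_t'$ leaves only the first term of the bracket and gives $\sin^2\beta$ in the same way.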

From the entries in Table 14.2 we see that astigmatism is zero for the Rowland
mounting only when $\alpha = \beta = 0$, corresponding to the zero order. The Wadsworth
mounting has zero astigmatism on the grating normal and small astigmatism over
a limited range on either side. For the Monk-Gillieson mounting, the astigmatism
is zero when $\beta = \pm\alpha$, where the minus sign gives the zero order for a plane
reflection grating and the plus sign is the zero order for a transmission grating.
For either plane grating in this mounting, therefore, there is a direction in which
astigmatism is zero at a diffracted wavelength given by $m\lambda = 2\sigma\sin\beta$.

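Table 14.2

Astigmatism of Grating Mountings

Rowland:         $\mathrm{TAS}/y = \sin^2\beta + \sin^2\alpha\,(\cos\beta/\cos\alpha)$
Wadsworth:       $\mathrm{TAS}/y = \sin^2\beta$
Monk-Gillieson:  $\mathrm{TAS}/y = (\sin^2\beta - \sin^2\alpha)/\cos^2\alpha$


The coefficients $A_2$ and $A_2'$ in Eq. (14.2.1), with $-n'$ substituted for $n$, are


$$\begin{aligned}
A_2 &= \frac{n'}{2}\left(\frac{\cos^2\alpha}{s} - \frac{\cos\alpha}{R}\right)\left(\frac{\sin\alpha}{s} - \frac{\sin\beta}{s_t'}\right), \\
A_2' &= \frac{n'}{2}\left[\frac{\sin\beta}{s_t'}\left(\frac{1}{s_t'} - \frac{\cos\beta}{R}\right) + \frac{\sin\alpha}{s}\left(\frac{1}{s} - \frac{\cos\alpha}{R}\right)\right],
\end{aligned} \tag{14.2.6}$$


where Eq. (14.2.2) is used to simplify (14.2.6). Relations analogous to Eqs.
(5.3.4) and (5.3.5) for these coefficients are


$$\mathrm{TA}_x = \frac{s_t'}{n'}(3A_2 x^2 + A_2' y^2), \qquad \mathrm{TA}_y = \frac{s_t'}{n'}\,2A_2' xy, \tag{14.2.7}$$


where TA denotes the transverse aberration. The expressions for the transverse
tangential and sagittal coma are


$$\mathrm{TTC} = 3A_2 x^2 s_t'/n', \qquad \mathrm{TSC} = A_2' y^2 s_t'/n'. \tag{14.2.8}$$


With reference to Fig. 5.8 with x and y interchanged, the relations in Eqs. (14.2.8)
give the coma in the direction parallel to the x-axis. If coma is present, its effect is
to degrade the spectral resolution.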

We now evaluate Eq. (14.2.6) for each of the grating mountings in Table 14.1.
It turns out that $A_2'$ is comparable in size to $A_2$ and we give its value only for the
one mounting in which $A_2 = 0$. The results are shown in Table 14.3.

From the entries in Table 14.3 we see that TTC for the Rowland mounting is
zero and TSC is small because $A_2'$ goes as the cube of factors that are usually
small. Coma is zero for the Rowland mounting only in the zero order. The
Wadsworth mounting has zero coma on the grating normal and small coma over a
limited range on either side. For the Monk-Gillieson mounting the plus and minus
signs apply to reflection and transmission gratings, respectively, and coma is zero
only in zero order.

Table 14.3

Coma of Grating Mountings

Rowland:    $A_2 = 0$
            $A_2' = \frac{n'}{2R^2}(\sin\beta\tan^2\beta + \sin\alpha\tan^2\alpha)$
Wadsworth:  $A_2 = \frac{n'\sin\beta}{2R^2}\left[\frac{\cos\alpha(\cos\beta + \cos\alpha)}{\cos^2\beta}\right]$


Spherical aberration is negligible in all practical configurations of the Rowland
and Monk-Gillieson mountings. The spherical aberration coefficient for the
Wadsworth mounting on the grating normal is given by


$$A_3 = \frac{n}{8R^3}\cos^2\alpha\,(1 + \cos\alpha), \tag{14.2.9}$$


with the transverse spherical aberration given by Eq. (5.4.1). Its size, to a good
approximation, is the same as that of a sphere in collimated light.


Table 14.4

Tangential Image Surface Curvatures

Rowland:         $\kappa_t = -2/R$
Wadsworth:       $\kappa_t = -(2/R)(1 + \tfrac{3}{2}\cos\alpha)$
Monk-Gillieson:  $\kappa_t = \pm(3\cos^2\alpha)/s$


The aberration coefficients for the concave grating mountings are derived 
assuming the pupil is at the grating. Given the limited sizes of concave gratings, 
this is the only practical way of using the grating. A plane grating has no preferred 
axis and the aberration coefficients are independent of the pupil location. 

For each of the grating mountings, we now determine the curvature of the 
image surface on which the tangential astigmatic images are located. This is 
easily done by applying Eqs. (5.7.1) and (5.7.2) to the relations for $s_t'$ in Table
14.1. The curvatures obtained are given in Table 14.4, where the plus and minus
signs for the Monk-Gillieson mounting apply to the reflection and transmission 
cases, respectively. 

Note that the curvature for the Rowland mounting is exact because the images 
lie on the Rowland circle. For the other mountings in Table 14.4 the curvatures 
are only approximations to the exact curvatures, though the relations given are 
adequate for most configurations of these mountings. The image surface for each 
mounting is concave as seen from the grating. 

All of the aberration relations needed for analysis of the different mountings 
are now in hand. The plane grating in collimated light has no aberrations, and 
aberration analysis of spectrometers with this mounting is reduced to considering 
the collimator and camera optics and any anamorphic magnification of the 
grating. The rest of this chapter is a discussion of the characteristics of selected 
concave grating spectrometers. Discussion of the characteristics of plane grating 
instruments follows in Chapter 15. 


14.3. CONCAVE GRATING MOUNTINGS 


In this section we look further at the characteristics of the two concave grating 
mountings introduced in the previous section. Although these mountings are little 
used, if at all, for stellar spectroscopy with ground-based telescopes, they are 
often used for ultraviolet spectroscopy from space. In this spectral region they are 
practical alternatives to plane grating spectrometers. For information on other
concave grating mountings, such as the Seya-Namioka monochromator (see
Namioka, 1959), the references should be consulted.


14.3.a. ROWLAND MOUNTING 


The Rowland mounting was probably the first type of grating spectrometer 
used for astronomical spectroscopy. A schematic diagram of a Rowland mounting 
is shown in Fig. 14.4. Although it was adequate for recording solar spectra, it was 
a failure in stellar spectroscopy, primarily because of its astigmatism. As an 
example of the size of the astigmatism, we take $\lambda = 500$ nm for a first-order 
grating with 600 grooves/mm and a diameter of 100 mm. From the relation in 
Table 14.2 we get TAS/y = 0.090 for $\alpha = 0$ and TAS/y = 0.045 for $\alpha = \beta$, 
hence image lengths are 9 and 4.5 mm, respectively, for a point source at the 
entrance slit. 
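
The two astigmatism values quoted above can be checked numerically. The sketch below assumes that the Table 14.2 entry for the Rowland mounting has the standard Rowland-circle form TAS/y = sin²β + sin α tan α cos β (the table itself is not reproduced in this section) and uses the grating equation mλ = σ(sin α + sin β). The function and variable names are illustrative only.

```python
import math

def rowland_tas_over_y(wavelength_nm, grooves_per_mm, order, alpha_equals_beta):
    """Return TAS/y for a Rowland mounting using the assumed relation
    TAS/y = sin^2(beta) + sin(alpha) * tan(alpha) * cos(beta)."""
    sigma_nm = 1.0e6 / grooves_per_mm            # groove spacing in nm
    sin_sum = order * wavelength_nm / sigma_nm   # sin(alpha) + sin(beta)
    if alpha_equals_beta:
        sin_alpha = sin_beta = sin_sum / 2.0
    else:                                        # normal incidence, alpha = 0
        sin_alpha, sin_beta = 0.0, sin_sum
    alpha = math.asin(sin_alpha)
    beta = math.asin(sin_beta)
    return sin_beta**2 + sin_alpha * math.tan(alpha) * math.cos(beta)

grating_diameter_mm = 100.0
ratio_normal = rowland_tas_over_y(500.0, 600.0, 1, alpha_equals_beta=False)
ratio_equal = rowland_tas_over_y(500.0, 600.0, 1, alpha_equals_beta=True)

# The image length is 2*TAS with y = d/2, i.e. (TAS/y) * d.
length_normal_mm = ratio_normal * grating_diameter_mm
length_equal_mm = ratio_equal * grating_diameter_mm
print(f"alpha = 0:    TAS/y = {ratio_normal:.3f}, length = {length_normal_mm:.1f} mm")
print(f"alpha = beta: TAS/y = {ratio_equal:.3f}, length = {length_equal_mm:.1f} mm")
```

Under this assumed relation the ratios come out as 0.090 and 0.045, matching the values in the text.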

Note that these lengths depend only on the grating size and are independent of 
the output beam focal ratio. Given these image lengths, the speed of a Rowland 
spectrometer is much less than that of a stigmatic spectrometer at the same 
camera focal ratio because the light is spread over a significantly larger area. 

Rowland spectrometers are used in the extreme ultraviolet where reflection 
efficiencies are low and the number of optical surfaces must be kept to an 
absolute minimum. In this spectral range both near-normal and grazing incidence 
mountings are used. Because of the shorter wavelengths, astigmatism is tolerable 
for near-normal mountings. For the grating example in the foregoing text, the 
astigmatism is smaller by a factor of 25 at $\lambda = 100$ nm for the same grating 
diameter. A comparison of different Rowland mountings suitable as 
spectrometers in the ultraviolet has been given by Namioka (1961). 


Fig. 14.4. Schematic of Rowland mounting; S, entrance slit; G, concave grating; C, center of 
curvature of grating; FS, focal surface. 

Grazing incidence spectrometers based on the Rowland circle are used at still 
shorter wavelengths, and their large astigmatism is reduced by using ellipsoidal 
and toroidal concave gratings. For further discussion of grazing incidence 
instruments, the reader should consult the references at the end of the chapter. 


14.3.b. WADSWORTH MOUNTING 


The Wadsworth mounting was a successful replacement for the Rowland 
mounting in the early days of stellar spectroscopy, primarily because it has 
significantly smaller astigmatism. With the same grating parameters as in the 
preceding example, the Wadsworth can be made stigmatic at $\lambda = 500$ nm by choosing 
$\sin\alpha = 0.3$. At wavelengths of 400 and 600 nm we find that TAS/y is 0.0036, and 
the image length is 0.36 mm for the same grating diameter. This image length 
increases for wavelengths farther from the stigmatic wavelength, roughly as the 
square of the wavelength difference. 

Comparing the astigmatic image lengths of the Wadsworth and Rowland 
mountings, it is evident that any decrease in transmittance of the Wadsworth due 
to additional optical elements is more than compensated by its greater speed. This 
advantage of the Wadsworth over the Rowland mounting holds at shorter 
wavelengths, provided high-efficiency reflecting films are available. 

With $\sin\alpha = 0.3$, it is straightforward to calculate the transverse coma and 
spherical aberration. Using $A_2$ from Table 14.3 and Eqs. (14.2.8) we find 
TTC = $0.175(d/F)\sin\beta$, where $d$ is the grating diameter and $F$ is the grating 
focal ratio. With $d = 100$ mm, as in the preceding example, and $F = 10$, we find the 
magnitude of TTC = 0.1 mm at 400 and 600 nm. Because coma is proportional 
to $\sin\beta$, its size is linearly dependent on the difference between the actual and 
corrected wavelengths. 

Using Eqs. (14.2.9) and (5.4.1) we find that TSA = $0.014\,d/F^2$, hence for our 
chosen $d$ and $F$ we get TSA = 14 μm. Compared to the size of the coma, it is 
evident that spherical aberration is negligible over most of the spectral region 
centered on 500 nm. 
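
A quick numerical check of the Wadsworth coma and spherical aberration is sketched below. It uses the coefficients quoted in the text (0.175 for coma and 0.014 for spherical aberration) and assumes the spherical-aberration relation scales as d/F², which is the scaling that reproduces the quoted 14 μm. The function name and argument names are hypothetical.

```python
def wadsworth_aberrations(wavelength_nm, grooves_per_mm=600.0, order=1,
                          sin_alpha=0.3, diameter_mm=100.0, focal_ratio=10.0):
    """Return (sin(beta), TTC in mm, TSA in mm) for the Wadsworth example."""
    sigma_nm = 1.0e6 / grooves_per_mm                 # groove spacing in nm
    # Grating equation: m*lambda/sigma = sin(alpha) + sin(beta)
    sin_beta = order * wavelength_nm / sigma_nm - sin_alpha
    ttc_mm = 0.175 * (diameter_mm / focal_ratio) * sin_beta
    tsa_mm = 0.014 * diameter_mm / focal_ratio**2     # assumed d/F^2 scaling
    return sin_beta, ttc_mm, tsa_mm

for wavelength in (400.0, 500.0, 600.0):
    sin_beta, ttc_mm, tsa_mm = wadsworth_aberrations(wavelength)
    print(f"{wavelength:.0f} nm: sin(beta) = {sin_beta:+.3f}, "
          f"|TTC| = {abs(ttc_mm) * 1000:.0f} um, TSA = {tsa_mm * 1000:.0f} um")
```

The coma vanishes at the stigmatic wavelength of 500 nm and is about 0.1 mm in magnitude at 400 and 600 nm, roughly seven times the spherical aberration.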

Because the grating in the Wadsworth mounting has collimated light incident, 
a separate collimator is required. A schematic diagram of a Wadsworth spectro- 
meter is shown in Fig. 14.5, with a flat fold mirror in series with an on-axis 
collimator mirror. In a correctly designed system for a Cassegrain telescope, the 
fold mirror is entirely inside the shadow of the secondary and does not vignette 
the collimated beam. 


Fig. 14.5. Schematic of Wadsworth spectrometer; S, entrance slit; M, collimator; G, concave 
grating; FS, focal surface. 


The collimator has no off-axis aberrations in this arrangement and spherical 
aberration is also zero if the mirror is a paraboloid. If the mirror is spherical, its 
spherical aberration adds to that of the grating. Other possible collimators include 
an off-axis parabolic mirror and a tilted concave mirror. The former is free of 
aberrations on-axis, while the latter has off-axis aberrations. The optical proper- 
ties of a tilted spherical collimator and a concave grating have been calculated by 
Namioka (1959) and Seya and Namioka (1967), and these references can be 
consulted for details. They consider both collimated and noncollimated light 
incident on the grating. 


14.3.c. INVERSE WADSWORTH MOUNTING 


In the standard Wadsworth mounting the grating is both the disperser and 
camera. In the inverse Wadsworth mounting the slit is at the focus of the grating, 
which is both the disperser and collimator, and a separate camera is required. The 
general aberration relations, not given here, are derived using the procedure 
already described. From these relations it turns out that coma and astigmatism 
are smallest at a given wavelength when $\alpha$ is zero. This is not surprising from the 
point of view of Fermat’s Principle; compared to the standard Wadsworth, whose 
aberrations are zero on the grating normal, the light rays are simply reversed in 
direction. 

Assume a camera of focal length $f$ that is aberration-free, and a concave 
grating for which the collimator focal ratio is $F$. The transverse aberrations of the 
inverse Wadsworth at $\alpha = 0$ are given by 

$$
\mathrm{TAS} = \frac{f}{2F}\left[\frac{\sin^2\beta\,\sin^2(\beta/2)}{\cos^2\beta}\right], \qquad
\mathrm{TTC} = \frac{3f}{16F^2}\left[\frac{\sin\beta\,\sin^2(\beta/2)}{\cos^2\beta}\right], \tag{14.3.1}
$$

where the total length of the tangential astigmatic image is 2 TAS. Given 
$\sin\beta = m\lambda/\sigma$ when $\alpha = 0$, we see that coma and astigmatism increase roughly 
proportional to the cube and fourth power of the wavelength, respectively. 

With $F = 10$, $f = 1000$ mm, and the same 600 groove/mm grating as used 
previously, the reader can verify that the image length and TTC are 
approximately 0.22 mm and 14 μm, respectively, at a wavelength of 500 nm. 
Compared to a standard Wadsworth of the same focal length, the average coma 
and astigmatism are significantly less in the visible and ultraviolet spectral 
regions. The spherical aberration coefficient of the inverse Wadsworth is the 
same as that of a standard Wadsworth. 

From the aforementioned aberration relations, we see there is freedom to 
choose a shorter camera focal length and thereby reduce the transverse aberra- 
tions. A Schmidt camera, with its corrector plate close to the grating, is a practical 
choice for a fast camera in an inverse Wadsworth. Both the incident and diffracted 
light from the grating pass through the corrector in this arrangement and, with the 
stop at the grating, the corrector size is essentially the same as that of the grating. 


14.3.d. CONCLUDING REMARKS 


From our discussion on concave grating mountings, it is evident that there are 
useful configurations, especially for short wavelengths, in which aberrations are 
tolerable. None of these mountings, however, has found favor on ground-based 
telescopes. One reason is that, except for the inverse Wadsworth, there is little 
freedom in the choice of camera focal ratios and match of projected slit to pixel 
size. 

Another shortcoming of the concave grating is its efficiency relative to that of 
the plane grating. A concave grating ruled by conventional methods has lower 
overall efficiency than a plane grating because grooves near one side of the 
concave grating have a different effective blaze angle than those near the other 
side. To overcome this defect, at least in part, concave gratings with bipartite and 
tripartite rulings have been made. For more information about these types of 
rulings and concave gratings in general, including holographic concave gratings, 
the reader should consult the reference by Loewen and Popov (1997). 


Chapter 15 Plane Grating Spectrometers 


Plane grating spectrometers have distinct advantages over concave grating 
instruments and have been the almost universal choice for large telescopes. 
Spectrometers in which the grating is illuminated by collimated light have only 
the aberrations of the collimator and camera degrading the spectrum quality. With 
careful attention to the design of these optical subsystems, aberrations can be 
reduced to an insignificant level. In addition, the freedom to choose collimator 
and camera focal ratios independently to get a desirable match between projected 
slit and pixel size is a significant advantage. 

Other important advantages of plane gratings are the wide range of sizes, 
groove densities, and blaze angles available as standard items. The selection 
makes it possible to design and build systems, either fiber-fed or with a standard 
slit, suited to almost any observing program. For high resolution, large echelles 
are especially suited to configurations in which many orders are arranged to cover 
a convenient 2D format. For low spectral resolution, transmission gratings and 
grisms used in a nonobjective mode are well-suited for survey programs. 

In this chapter we emphasize the design principles of plane grating spectro- 
meters and cite examples of existing instruments for many of the configurations 
discussed. For a full range of design possibilities the interested reader should 
consult any of the conference proceedings cited at the end of the chapter. 


15.1. ALL-REFLECTING SPECTROMETERS 


We first consider plane grating spectrometers with mirror optics to collimate 
and focus the dispersed light. Instruments of this type, such as the Czerny-Turner 
and Ebert-Fastie designs, are used in laboratories and have been used as 
astronomical spectrometers. The major disadvantage of the designs discussed 
in this section is the lack of freedom to choose the camera focal ratio 
independently of the collimator focal ratio. In spite of this lack of flexibility, 
the characteristics of these designs merit some discussion. 


15.1.a. CZERNY-TURNER MOUNTING 


The Czerny-Turner design is a widely used laboratory spectrometer suited for 
either the monochromator or spectrograph mode. This design is also used for 
astronomical spectrometers; for example, the main spectrograph at the McMath 
solar telescope at Kitt Peak National Observatory, Arizona, is of this type. We 
discuss the characteristics of both modes of operation, starting with the 
monochromator mode. 

The optical layout for a Czerny-Turner mounting (CZ) is shown in Fig. 15.1. 
The spherical collimator and camera mirrors are $M_1$ and $M_2$, respectively, G is the 
grating, and the entrance and exit slits are at Q and Q’, respectively. The axis of 
each mirror is tilted with respect to its incident chief ray, and thus each has both 
on- and off-axis aberrations. 

To ensure that the tangential fan of rays incident on the grating is strictly 
collimated, the slit at Q is at the tangential focus of the collimator. The 
consequence of this choice is that the distance from the camera vertex to Q’ is 
independent of the grating angles and separation between the grating and each 
mirror. The spectrum is scanned by rotating the grating about an axis along its 
central groove. 


Fig. 15.1. Optical arrangement of Czerny-Turner mounting; $M_1$, collimator; G, grating; $M_2$, 
camera; Q(Q’), entrance (exit) slit. Dispersion direction is in the plane of diagram. 

The aberrations of the CZ mounting are easily found using the results in 
Chapter 5. In the monochromator mode, the position of the stop is of no 
consequence because the beam location at the camera is always the same. Thus 
we can take the coefficients for each mirror from Table 5.2, where the 
magnification m is infinite for the collimator and zero for the camera. One 
change is needed in these coefficients before writing the system coefficients. The 
coefficient $A_1$ denotes the astigmatism at the sagittal image, but for a grating 
instrument we require the astigmatism at the tangential image. Hence we take $A_1'$ 
instead of $A_1$ where, as seen in Eqs. (5.2.3) and (5.2.4), these coefficients have 
opposite signs. This sign change has no effect on the final relations that give the 
length of the astigmatic image. 

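With this change we take the coefficients from Table 5.2 and substitute them 
into Eq. (5.6.7) to get the system coefficients. The results are 

$$
A_{1s} = \frac{\theta_1^2}{R_1}\left[1 + \left(\frac{\theta_2}{\theta_1}\right)^2\frac{R_1}{R_2}\right], \tag{15.1.1}
$$

$$
A_{2s} = \frac{\theta_1}{R_1^2}\left[1 - \frac{\theta_2}{\theta_1}\left(\frac{R_1}{R_2}\right)^2\left(\frac{x_2}{x_1}\right)^3\right], \tag{15.1.2}
$$

where $y_1 = y_2$, $n_1 = n_2 = -1$ from Fig. 15.1, and subscripts 1 and 2 refer to 
collimator and camera mirrors, respectively. 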

Because $R_1$ and $R_2$ have the same sign, the astigmatisms of the two mirrors 
add to give the system astigmatism. A similar result is found for the spherical 
aberration of the system. With a proper choice of angles, however, the coma 
coefficient can be made zero. 

The relation between $x_1$ and $x_2$ is derived from the geometry in Fig. 15.1. If $x$ 
is the coordinate of a marginal ray at the grating, then $x_1\cos\theta_1 = x\cos\alpha$ and 
$x_2\cos\theta_2 = x\cos\beta$. Putting these results into Eq. (15.1.2), and setting $A_{2s} = 0$, 
gives 

$$
\frac{\theta_2}{\theta_1} = \left(\frac{R_2}{R_1}\right)^2\left(\frac{\cos\alpha_0}{\cos\beta_0}\right)^3\left(\frac{\cos\theta_2}{\cos\theta_1}\right)^3. \tag{15.1.3}
$$

The relation in Eq. (15.1.3) is an approximation to the exact third-order equation 
in which the left side is the ratio of the sines of the angles. As we will show, it is 
necessary to have $\theta_1$ and $\theta_2$ small in order to keep the astigmatism small. For 
most purposes, therefore, the paraxial approximation is adequate and $\cos\theta$ factors 
can be replaced by one. 

Note that a choice of $\theta_1$ and $\theta_2$ to satisfy Eq. (15.1.3) is possible for any set of 
grating angles, but once chosen there is a small residual coma at other 
wavelengths. This residual, a result of wavelength-dependent anamorphic 
magnification, has been called subsidiary coma by Welford (1965). The size of the 
subsidiary coma is given in an example to follow. 

Putting Eq. (15.1.3) into Eq. (15.1.1), with factors in $\cos\alpha_0$ and $\cos\beta_0$ 
retained, we find 

$$
A_{1s} = \frac{\theta_1^2}{R_1}\left[1 + \left(\frac{R_2}{R_1}\right)^3\left(\frac{\cos\alpha_0}{\cos\beta_0}\right)^6\right]. \tag{15.1.4}
$$

The transverse aberrations are found by substituting the system coefficients into 
Eq. (5.5.9). The results are given in Table 15.1, with $d_1$ and $F_1$ denoting the 
collimator diameter and focal ratio, respectively. 

The relations in Table 15.1 describe a spectrometer in which the coma is zero 
at one wavelength and negligible over an extended range of wavelengths. It is 
useful at this point to give the characteristics of a representative example, that of 
an f/10 spectrometer with $f_1 = 1$ m and $R_1 = R_2 = 2$ m. We choose $\alpha - \beta = 8^\circ$ 
and $\theta_1 = 3^\circ$ to provide clearance between the optical elements and beams. With a 
600 groove/mm grating used in first order, and choosing zero coma at 
$\lambda_0 = 500$ nm, we find $\theta_2 = 2.814^\circ$. 
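
The camera angle in this example can be reproduced with a few lines of code. The sketch below assumes the grating equation in the form mλ/σ = sin α + sin β and the zero-coma condition in the form θ₂/θ₁ = (R₂/R₁)²(cos α₀/cos β₀)³, with the cos θ factors set to one as the text suggests. Function names are illustrative.

```python
import math

def grating_angles(wavelength_nm, grooves_per_mm, order, alpha_minus_beta_deg):
    """Solve m*lambda/sigma = sin(alpha) + sin(beta) with alpha - beta fixed.
    Returns (alpha, beta) in radians."""
    sigma_nm = 1.0e6 / grooves_per_mm
    half_diff = math.radians(alpha_minus_beta_deg) / 2.0
    # sin(alpha) + sin(beta) = 2 sin((alpha+beta)/2) cos((alpha-beta)/2)
    half_sum = math.asin(order * wavelength_nm
                         / (2.0 * sigma_nm * math.cos(half_diff)))
    return half_sum + half_diff, half_sum - half_diff

def zero_coma_theta2(theta1_deg, r1, r2, alpha0, beta0):
    """Zero-coma camera angle (degrees), cos(theta) factors set to one."""
    return theta1_deg * (r2 / r1) ** 2 * (math.cos(alpha0) / math.cos(beta0)) ** 3

alpha0, beta0 = grating_angles(500.0, 600.0, 1, 8.0)
theta2_deg = zero_coma_theta2(3.0, 2000.0, 2000.0, alpha0, beta0)
print(f"alpha0 = {math.degrees(alpha0):.3f} deg, beta0 = {math.degrees(beta0):.3f} deg")
print(f"theta2 = {theta2_deg:.3f} deg")
```

With these inputs the grating angles are close to 12.65° and 4.65°, and the camera angle agrees with the 2.814° quoted above to within rounding.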

With these parameters we find that the image length is 0.52 mm at the zero-coma 
wavelength and TSA = 31 μm. At ±200 nm from the corrected wavelength, 
the transverse coma is 2.6 μm and the length of the astigmatic image is 
unchanged. For all practical purposes over this range, coma is negligible, 
astigmatism is constant, and spherical aberration determines the spectral 
resolution. With a plate factor of 1.67 nm/mm, the resolution at the minimum width of 
the spherical aberration blur is 0.026 nm. 
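
The sizes quoted in this example can be approximated with the short sketch below. It rests on several assumptions that should be read as such: the paraxial Table 15.1 relations are taken in the form 2 TAS = θ₁²d₁(R₂/R₁)[1 + (R₂/R₁)³(cos α₀/cos β₀)⁶], TTC = (3θ₁d₁/16F₁)(R₂/R₁)|1 − (cos α₀ cos β / cos α cos β₀)³|, and TSA = (d₁/64F₁²)(R₂/R₁)[1 + (R₁/R₂)³]; the plate factor is approximated by σ/(m f₂); and the minimum blur width is taken as TSA/2. All names are illustrative.

```python
import math

SIGMA_NM = 1.0e6 / 600.0          # groove spacing for 600 grooves/mm
ORDER = 1
D1_MM, F1 = 100.0, 10.0           # collimator diameter and focal ratio (f1 = 1 m)
R1_MM = R2_MM = 2000.0
THETA1 = math.radians(3.0)
HALF_DIFF = math.radians(8.0) / 2.0   # (alpha - beta) / 2

def cos_ratio(wavelength_nm):
    """Return cos(alpha)/cos(beta) at a given wavelength with alpha - beta fixed."""
    half_sum = math.asin(ORDER * wavelength_nm
                         / (2.0 * SIGMA_NM * math.cos(HALF_DIFF)))
    return math.cos(half_sum + HALF_DIFF) / math.cos(half_sum - HALF_DIFF)

ratio0 = cos_ratio(500.0)         # value at the zero-coma wavelength
scale = R2_MM / R1_MM

# Astigmatic image length (2 * TAS) and transverse spherical aberration.
image_length_mm = THETA1**2 * D1_MM * scale * (1.0 + scale**3 * ratio0**6)
tsa_mm = D1_MM / (64.0 * F1**2) * scale * (1.0 + (1.0 / scale)**3)

def subsidiary_coma_mm(wavelength_nm):
    """Residual (subsidiary) coma away from the zero-coma wavelength."""
    factor = abs(1.0 - (ratio0 / cos_ratio(wavelength_nm))**3)
    return 3.0 * THETA1 * D1_MM / (16.0 * F1) * scale * factor

plate_factor_nm_per_mm = SIGMA_NM / (ORDER * R2_MM / 2.0)
resolution_nm = plate_factor_nm_per_mm * tsa_mm / 2.0   # assumed minimum blur = TSA/2

print(f"image length   = {image_length_mm:.2f} mm")
print(f"TSA            = {tsa_mm * 1000:.0f} um")
print(f"coma at 700 nm = {subsidiary_coma_mm(700.0) * 1000:.1f} um")
print(f"resolution     = {resolution_nm:.3f} nm")
```

Under these assumptions the results land close to the 0.52 mm, 31 μm, 2.6 μm, and 0.026 nm given in the text.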





Table 15.1 


Transverse Aberrations of Czerny-Turner Monochromator$^a$ 

$$
\mathrm{TAS} = \frac{\theta_1^2 d_1}{2}\left(\frac{R_2}{R_1}\right)\left[1 + \left(\frac{R_2}{R_1}\right)^3\left(\frac{\cos\alpha_0}{\cos\beta_0}\right)^6\right]
$$

$$
\mathrm{TTC} = \frac{3\theta_1 d_1}{16F_1}\left(\frac{R_2}{R_1}\right)\left[1 - \left(\frac{\cos\alpha_0\cos\beta}{\cos\alpha\cos\beta_0}\right)^3\right]
$$

$$
\mathrm{TSA} = \frac{d_1}{64F_1^2}\left(\frac{R_2}{R_1}\right)\left[1 + \left(\frac{R_1}{R_2}\right)^3\right]
$$

$^a$ Angle $\theta_2$ is chosen to give zero coma at $\lambda = \lambda_0$. All $\cos\theta_1$ and $\cos\theta_2$ factors are 
omitted. Length of astigmatic image is 2 TAS. 


From this example it is evident that a CZ monochromator gives good spectral 
resolution, provided the focal ratios of the camera and collimator are sufficiently 
large. An instrument of this type is suitable for stellar sources, but its astigmatism 
limits its usefulness for extended sources if spatial resolution along the slit is 
required. The angle corresponding to the image length projected on the sky can 
be calculated using Eq. (12.2.1b). 

We now consider the CZ in the spectrograph mode, with the grating and 
camera mirror shown schematically in Fig. 15.2. The z-axis of the mirror is the 
normal to the mirror at its center, with the angle of the chief diffracted ray 
measured with respect to this axis. An arbitrary chief ray makes angle $\theta_2$ with the 
axis, with $\theta_{20}$ used for the chief ray at the mirror center. 

The effective position of a pupil centered on the z-axis is shown in Fig. 15.2 as 
a distance $W'$ from the mirror vertex. Denoting the grating-mirror separation 
measured parallel to the z-axis by $W$ we find, in the paraxial approximation, 
$W'\theta_2 = W(\theta_2 - \theta_{20})$. 

The aberration coefficients for the camera mirror are now found by 
substitution of $W'$ for $W$ and $\theta_2$ for $\psi$ in the relations in Table 5.6. Combining these with 
the coefficients of the collimator gives 

$$
B_{1s} = \frac{\theta_1^2}{R_1} + \frac{1}{R_2}\left[\theta_2 - \frac{W}{R_2}(\theta_2 - \theta_{20})\right]^2, \tag{15.1.5}
$$

$$
B_{2s} = \frac{\theta_1}{R_1^2} - \frac{1}{R_2^2}\left[\theta_2 - \frac{W}{R_2}(\theta_2 - \theta_{20})\right], \tag{15.1.6}
$$

where the paraxial approximation is used for all angles. This approximation is 
quite adequate for calculating the image aberrations in the spectrograph mode, 
and is used in the discussion and example that follow. 

Examination of Eqs. (15.1.5) and (15.1.6) shows that the coefficients are 
independent of $\theta_2$ when $W = R_2$. The transverse aberrations of the spectrograph 
are then given by the relations in Table 15.1. An instrument with this choice of $W$ 
has been described by Willstrop (1965). The principal disadvantages of a 
spectrograph with this $W$ are its longer length and curved focal surface. 


Fig. 15.2. Diffraction and camera angles for Czerny-Turner spectrograph. C, center of curvature 
of mirror $M_2$; $W'$, distance of apparent stop from $M_2$ with grating as stop. 

Returning to the geometry of the chief rays in Fig. 15.2, we see that 
$\beta - \beta_0 = -(\theta_2 - \theta_{20})$. In the paraxial approximation to the grating equation 
we find $\beta - \beta_0 = m(\lambda - \lambda_0)/\sigma$. With this substitution in Eqs. (15.1.5) and 
(15.1.6), the system coefficients are expressed in terms of the wavelength 
difference with respect to the wavelength at the center. The transverse aberrations, 
given in this form, are shown in Table 15.2, where $\theta_{20}$ is chosen to give zero 
coma. 

The relations in Table 15.2 show that the coma is linearly proportional to the 
wavelength difference, while the astigmatism depends on this difference in a 
more complicated way. As an example, we take the same grating and mirror 
parameters used for the monochromator, with $\lambda_0 = 500$ nm and $W/R_2 = 0.5$. The 
coma for $\Delta\lambda = \pm 50$ nm is 28 μm, and the image lengths are 0.41, 0.53, and 
0.73 mm at $\lambda = 550$, 500, and 450 nm, respectively. The results obtained from ray 
traces are within a few percent of these values and justify the approximations 
used. 

Unlike the monochromator mode, neither the coma nor the astigmatism is 
constant, and both the coma and spherical aberration determine the spectral 
resolution. Nevertheless, the CZ mounting is practical in the spectrograph mode, 
provided a modest spectral range is covered. 

The final item of interest for the spectrograph mode is the curvature of the 
tangential focal surface, found using the geometry in Fig. 15.3. The line O’C 
passes through the center of the grating, which is the stop, and the center of 
curvature C of the mirror. A chief ray that makes angle $\theta_2$ with the line OC makes 
angle $\psi$ with the line O’C. Because all rays from the grating are parallel, an 
imaginary ray reflected at O’ from an extension of the mirror makes angle 
$\theta' = -\psi$ with the line O’C. 

Relative to the line O’C, which is effectively a z-axis of the mirror, the 
astigmatism coefficient is $B_1$ in Table 5.6. This coefficient cannot be used to find 
the transverse astigmatism because the actual mirror is not centered at O’, but it 
can be used to find the image surface curvature. Substituting $B_1$ in $\kappa_t$ in Table 5.7, 
and noting that $\theta' = -\psi$, we find 

$$
\kappa_t = \frac{2}{R_2} - \frac{6}{R_2}\left(1 - \frac{W}{R_2}\right)^2, \tag{15.1.7}
$$

where $W$ is the distance O’G in Fig. 15.3. The center of curvature of the image 
surface is on the line O’C, and a flat image surface is perpendicular to this line. 


Table 15.2 


Transverse Aberrations of Czerny-Turner Spectrograph$^a$ 

$^a$ Angle $\theta_{20}$ is chosen to give zero coma at $\lambda = \lambda_0$. 


Fig. 15.3. Geometry for finding orientation and curvature of focal surface for Czerny-Turner 
spectrograph. 

Setting $\kappa_t = 0$ gives the condition for a flat image surface as 
$W/R_2 = 1 \pm 1/\sqrt{3}$, with the minus sign a more convenient choice. With this 
value for $W/R_2$, the coma measures for the preceding example are increased by 
about 15%, but this is a small price to pay for the convenience of a flat field. 
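
The flat-field condition and the size of the coma penalty can be checked with a short calculation. The sketch below assumes that the image-surface curvature is proportional to 1 − 3(1 − W/R₂)² and that the wavelength-dependent coma in the spectrograph mode scales as (1 − W/R₂) in the paraxial approximation; both forms are inferred from the surrounding discussion rather than taken from a table, and the names are illustrative.

```python
import math

def field_curvature_factor(w_over_r):
    """Dimensionless factor 1 - 3*(1 - W/R2)**2; zero means a flat tangential field."""
    return 1.0 - 3.0 * (1.0 - w_over_r) ** 2

# The two roots of the flat-field condition.
flat_minus = 1.0 - 1.0 / math.sqrt(3.0)   # shorter instrument
flat_plus = 1.0 + 1.0 / math.sqrt(3.0)

# Coma relative to the earlier example, which used W/R2 = 0.5.
coma_ratio = (1.0 - flat_minus) / (1.0 - 0.5)

print(f"W/R2 (minus sign) = {flat_minus:.3f}")
print(f"W/R2 (plus sign)  = {flat_plus:.3f}")
print(f"coma increase     = {(coma_ratio - 1.0) * 100:.0f}%")
```

The minus-sign root places the grating at about 0.42 R₂ from the camera mirror, and the coma grows by roughly 15% relative to the W/R₂ = 0.5 case, consistent with the statement above.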

The monochromator and spectrograph modes of the Czerny-Turner design 
have been thoroughly analyzed in far more detail than given here. Higher-order 
aberrations and detailed ray trace analyses lead to refinements of the relations 
given here, and the interested reader should consult the references at the end of 
the chapter for more information. Included in these references are treatments of 
off-plane designs and instruments with curved slits for achieving the highest 
possible spectral resolution of broad sources. These treatments are not included 
here because for astronomical applications the in-plane design with a short slit is 
best suited to observations of point sources. 


15.1.b. EBERT-FASTIE MOUNTING 


The Czerny-Turner design can be considered a generalized version of the 
Ebert-Fastie design. In the Ebert-Fastie mounting, hereafter denoted EF, a single 
spherical mirror serves as both collimator and camera, thus eliminating the need 
for alignment of separate mirrors. This mounting is almost always used in the 
monochromator mode with the grating generally located at the paraxial focal 
point of the spherical mirror. Figure 15.1 applies to the EF mounting if the z-axes 
are redrawn to intersect at the common center of curvature for mirrors $M_1$ and 
$M_2$. If the incident and emergent chief rays to and from the mirror are parallel in 
this redrawn diagram, the angle $\alpha - \beta$ is nearly equal to $2(\theta_1 + \theta_2)$. 

An analysis of the aberrations of an EF mirror with an intermediate plane 
mirror is given in Section 5.8, where we show that this configuration is an 
anastigmat if the mirror is a paraboloid. For a spherical mirror both spherical 
aberration and astigmatism are present, though these do not seriously degrade the 
spectral resolution if the beam focal ratio is f/10 or slower. Coma is zero for 
the all-mirror system in Section 5.8, but the replacement of the plane mirror with 
a grating introduces subsidiary coma as noted following Eq. (15.1.3). 

The aberrations of the Ebert-Fastie monochromator are given in Table 15.1 
with $R_1 = R_2$. References on this mounting are listed at the end of the chapter. 


15.1.c. MONK-GILLIESON MOUNTING 


We now consider briefly the Monk-Gillieson mounting in which a plane 
grating is illuminated with a convergent light bundle. In this mounting the grating 
contributes to the system aberrations, but with proper arrangement of the optical 
elements these aberrations can be made zero at one wavelength. At other 
wavelengths, coma and astigmatism are present in amounts directly proportional 
to the difference from the corrected wavelength. Because of these 
wavelength-dependent aberrations, this mounting is best suited to the 
monochromator mode. 

A schematic layout of a Monk-Gillieson monochromator, hereafter called MG, 
is shown in Fig. 15.4. The entrance and exit slits are at Q and Q’, respectively, 
and M is a spherical mirror. If the grating G were absent, point Q would be 
imaged at Q”. The distances from the grating vertex to Q’ and Q”, respectively, 
are $s'$ and $s$, where $s'$ refers to the tangential astigmatic image. The distance from 
the mirror vertex O to Q” is $r = s + r'$. 


Fig. 15.4. Optical arrangement of Monk-Gillieson mounting. Q(Q’), entrance (exit) slit; M, 
spherical mirror; G, plane grating. 

In the discussion that follows, all angles are assumed small enough so the 
paraxial approximation applies, hence the grating equation is $m\lambda = n\sigma(\beta + \alpha)$, 
where $n = -1$ for the layout in Fig. 15.4. We also define $\gamma = \alpha - \beta$, and note that 
$s' = -s$ in this approximation. In terms of these parameters, we find the system 
aberration coefficients by the usual method, and use Eqs. (5.4.1) to find the 
transverse coma and astigmatism. Omitting the details here, we find 

$$
\mathrm{TTC} = \frac{3s'}{8F_g^2}\,\frac{m}{\sigma}\,|\lambda - \lambda_0|, \tag{15.1.8}
$$

$$
\mathrm{TAS} = \frac{\theta_0 d}{|M + 1|}\,\frac{m}{\sigma}\,|\lambda - \lambda_0|, \tag{15.1.9}
$$

where $d$ is the beam diameter at the grating, $F_g$ is the grating beam focal ratio, 
and $\lambda_0$ is the wavelength at which both coma and astigmatism are zero. Here $M$ is 
used for the magnification of the mirror to avoid confusion with the order number 
$m$. 

The aberration coefficients used to construct Eqs. (15.1.8) and (15.1.9) contain 
the angles $\theta$ and $\gamma$. Setting these coefficients to zero at the same wavelength $\lambda_0$ 
gives 

$$
\theta_0 = -\frac{m\lambda_0}{\sigma}\,\frac{sR}{r^2(M + 1)}, \qquad \gamma_0 = -\frac{2\theta_0}{M + 1}. \tag{15.1.10}
$$

For the layout shown in Fig. 15.4, $s$ and $R < 0$, and $M < -1$; therefore $\gamma_0$ and $\theta_0$ 
have the same sign and are positive when $m > 0$. Given that the aberrations in 
Eqs. (15.1.8) and (15.1.9) are proportional to $m$, the only practical choice is 
$m = 1$. 

Both aberrations increase linearly with the wavelength difference. Because 
astigmatism depends on $\theta_0$, its value is also proportional to $\lambda_0$. The spectral 
resolution is set by the size of TTC at any wavelength. It is convenient to define 
the spectral coma $\delta\lambda_c$ as the spectral width of an image, where 

$$
\delta\lambda_c = \mathrm{TTC}\cdot P, \tag{15.1.11}
$$

and the plate factor $P = \sigma/ms'$ for small $\beta$. Multiplying Eq. (15.1.8) by $P$, we see 
that the spectral coma depends only on the focal ratio of the beam at the grating 
for a given wavelength difference. 
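
The cancellation behind this statement can be written out in one line. The step below is a sketch that assumes Eq. (15.1.8) has the paraxial form $\mathrm{TTC} = (3s'/8F_g^2)(m/\sigma)|\lambda - \lambda_0|$ and uses the plate factor $P = \sigma/ms'$:

$$
\delta\lambda_c = \mathrm{TTC}\cdot P
= \frac{3s'}{8F_g^2}\,\frac{m}{\sigma}\,|\lambda - \lambda_0|\cdot\frac{\sigma}{ms'}
= \frac{3\,|\lambda - \lambda_0|}{8F_g^2}.
$$

The image distance, groove spacing, and order all drop out. As an illustration, a beam with $F_g = 20$ and a wavelength offset of 100 nm would give a spectral coma of roughly 0.094 nm under this form.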

As an example we take a 600 groove/mm grating in first order, with 
QO = 500 mm and $M = -2$ for the mirror. We assume a beam diameter of 
50 mm at the mirror, hence $F_g = 20$, and place the grating to give $s' = 545$ mm. 
Choosing $\lambda_0 = 320$ nm gives angles $\theta_0 = 3.6^\circ$ and $\gamma_0 = 7.2^\circ$, and at ±100 nm 
from the corrected wavelength we find TTC = 31 μm, $\delta\lambda_c = 0.095$ nm and an 
image length of 0.2 mm. Ray traces of this system show that the actual $\lambda_0$ is about 
10% smaller and, with this correction, the spectral coma and astigmatism from 
the ray traces agree with the calculated values. The difference in $\lambda_0$ is a 
consequence of the paraxial approximation. 
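
The aberration sizes in this example can be reproduced approximately with the sketch below. It assumes the paraxial forms TTC = (3s'/8F_g²)(m/σ)|λ − λ₀| and TAS = (θ₀d/|M + 1|)(m/σ)|λ − λ₀|, takes the beam diameter at the grating as d = s'/F_g, and uses the quoted θ₀ = 3.6° directly rather than deriving it. Variable names are illustrative.

```python
import math

sigma_nm = 1.0e6 / 600.0          # groove spacing for 600 grooves/mm
order = 1
s_prime_mm = 545.0                # grating-to-exit-slit distance
f_g = 20.0                        # focal ratio of the beam at the grating
mirror_mag = -2.0                 # mirror magnification M
theta0 = math.radians(3.6)        # mirror angle quoted in the text
delta_lambda_nm = 100.0           # offset from the corrected wavelength

dispersion_term = order * delta_lambda_nm / sigma_nm   # (m/sigma)|lambda - lambda0|

ttc_mm = 3.0 * s_prime_mm / (8.0 * f_g**2) * dispersion_term
plate_factor_nm_per_mm = sigma_nm / (order * s_prime_mm)
spectral_coma_nm = ttc_mm * plate_factor_nm_per_mm

beam_diameter_mm = s_prime_mm / f_g                    # assumed: d = s'/F_g
tas_mm = theta0 * beam_diameter_mm / abs(mirror_mag + 1.0) * dispersion_term
image_length_mm = 2.0 * tas_mm

print(f"TTC           = {ttc_mm * 1000:.0f} um")
print(f"spectral coma = {spectral_coma_nm:.3f} nm")
print(f"image length  = {image_length_mm:.2f} mm")
```

These assumptions give a transverse coma of about 31 μm, a spectral coma near 0.094 nm, and an image length of about 0.2 mm, in line with the values stated above.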

Unlike the Czerny-Turner monochromator, the grating in the Monk-Gillieson 
cannot be rotated about an axis on its face and maintain focus at a fixed exit slit. 
From Table 14.1 we see that $s'$ is a function of $\alpha$ and $\beta$ for a given $s$. Noting that 
both $\alpha$ and $\beta$ are variables in the scanning mode, with $d\alpha = d\beta$, we find in the 
paraxial approximation 

$$
ds' = -2s\gamma_0\,d\alpha = -s\gamma_0\,(m\,d\lambda/\sigma), \tag{15.1.12}
$$

where $d\lambda$ is the wavelength interval scanned. For the example above, 
$ds' = -2.05$ mm for $d\lambda = 50$ nm. To keep the spectrum in focus on a fixed exit slit, it is 
necessary to simultaneously translate and rotate the grating. It is not difficult to 
show that a rotation about an appropriate axis displaced from the grating face 
gives the required motion. Details on location of this axis are given in the 
reference by Schroeder (1966). 
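
The size of the focus shift can be checked with the relation above. The sketch below evaluates only the magnitude |ds'| = |s| γ₀ (m dλ/σ), since the sign depends on the conventions adopted for s and the grating equation; the variable names are illustrative.

```python
import math

sigma_nm = 1.0e6 / 600.0        # groove spacing for 600 grooves/mm
order = 1
s_abs_mm = 545.0                # |s| = s' in the paraxial approximation
gamma0 = math.radians(7.2)      # gamma_0 from the example
d_lambda_nm = 50.0              # wavelength interval scanned

focus_shift_mm = s_abs_mm * gamma0 * order * d_lambda_nm / sigma_nm
print(f"|ds'| = {focus_shift_mm:.2f} mm for a {d_lambda_nm:.0f} nm scan")
```

A shift of about 2 mm over a 50 nm scan is far larger than the depth of focus of an f/20 beam, which is why the grating must be translated as well as rotated.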

It is evident from Eq. (15.1.8) that the Monk-Gillieson scanner is limited to 
relatively slow beams to keep the coma small, particularly if a large spectral range 
about the zero-coma wavelength is scanned. A scanning Czerny-Turner or Ebert- 
Fastie is clearly superior in this regard. One attractive feature of the Monk- 
Gillieson is its two reflections, hence the entrance and exit slits are on opposite 
ends of the instrument. 


15.2. PIXEL MATCHING 


Efficient use of any grating spectrometer requires matching the projected slit 
width to the pixel size. According to the Nyquist criterion the optimum match is 
two pixels per projected slit width. If the projected slit is undersampled with less 
than two pixels there is a loss in resolution, while if the slit width is oversampled 
the per pixel signal is less for a given exposure time and readout noise is more 
significant. We now determine the relation between spectral resolving power and 
pixel size for an optimum match. 

The relevant relations for slit- and seeing-limited cases are Eqs. (13.4.12) and 
(12.2.1a), repeated here for convenience, 

$$
\mathcal{R}\phi = \frac{2d_1}{D}\,\frac{\sin\delta\cos\theta}{\cos\alpha} \quad \text{(in general)} \tag{15.2.1}
$$

$$
\mathcal{R}\phi = \frac{2d_1}{D}\tan\delta \quad \text{(Littrow)} \tag{15.2.2}
$$

$$
w' = rw(f_2/f_1) = r\phi DF_2 = 2\Delta \quad \text{(optimum match)} \tag{15.2.3}
$$

$$
\phi F_2 = 2\Delta/D \quad \text{(optimum match, } r = 1\text{)} \tag{15.2.4}
$$

where $F_2 = f_2/d_1$ and $\Delta$ is the pixel width. 

To illustrate the utility of these relations, we apply Eqs. (15.2.2) and (15.2.4) to 
an echelle spectrometer in Littrow mode on a 4-m telescope with $\tan\delta = 2$, 
$d_1 = 200$ mm, and $\Delta = 20$ μm. The results are $\mathcal{R}\phi = 0.2$ rad $\approx$ 40,000 arc-sec and 
$\phi F_2 = 2$ arc-sec. Note that a change to an 8-m telescope, with unchanged 
spectrometer parameters, gives $\mathcal{R}\phi = 0.1$ rad $\approx$ 20,000 arc-sec and $\phi F_2 = 1$ arc-sec. 
The need for larger gratings with larger telescopes to maintain resolving power 
is evident from a comparison of these numbers. 
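
The two Littrow cases can be evaluated directly from Eqs. (15.2.2) and (15.2.4). The sketch below is a minimal calculation with illustrative names; it converts the results from radians to arc-seconds and assumes unit anamorphic magnification, as appropriate for the Littrow mode.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def littrow_match(telescope_diameter_mm, beam_diameter_mm, tan_delta, pixel_mm):
    """Return (R*phi, phi*F2), both in arc-seconds, for a Littrow echelle."""
    r_phi_rad = 2.0 * beam_diameter_mm * tan_delta / telescope_diameter_mm
    phi_f2_rad = 2.0 * pixel_mm / telescope_diameter_mm   # optimum two-pixel match
    return r_phi_rad * ARCSEC_PER_RAD, phi_f2_rad * ARCSEC_PER_RAD

r_phi_4m, phi_f2_4m = littrow_match(4000.0, 200.0, 2.0, 0.020)
r_phi_8m, phi_f2_8m = littrow_match(8000.0, 200.0, 2.0, 0.020)

print(f"4-m: R*phi = {r_phi_4m:,.0f} arc-sec, phi*F2 = {phi_f2_4m:.2f} arc-sec")
print(f"8-m: R*phi = {r_phi_8m:,.0f} arc-sec, phi*F2 = {phi_f2_8m:.2f} arc-sec")
```

For a 1 arc-sec slit on the 4-m telescope, these numbers correspond to a resolving power of about 41,000 with a camera focal ratio near 2; on the 8-m telescope both values are halved.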

A study of these relations, along with the flux-resolution product $F\mathcal{R}$ in Eq. 
(12.2.13), also shows the importance of better seeing for stellar spectroscopy. The 
product $F\mathcal{R}$ is larger for smaller $\phi'$ (better seeing), hence for a given slit width $\phi$ 
the fraction of the light passed by the slit is larger. Alternatively, the slit width $\phi$ 
can be reduced and larger resolving power achieved. For a smaller $\phi$ note also 
that $F_2$ is larger for the optimum match. This can be used to advantage in the 
design of a new spectrometer if better average seeing can be ensured. 

As a final item we note that the product $\phi D$ in Eqs. (15.2.1)-(15.2.4) is 
replaced by $\lambda$ for the diffraction-limited case. 


15.3. FAST SPECTROMETERS 


In this section we consider various types of plane grating mountings with 
different collimator and camera configurations. Among the choices for a 
collimator are an off-axis paraboloid, on-axis paraboloid with fold mirror, and 
a well-corrected lens. The choices for a camera include a standard or folded 
Schmidt, a Schmidt-Cassegrain, or a well-corrected lens. Because the focal ratio 
of the camera is typically several times smaller than that of the collimator, we 
choose to call spectrometers of this type fast. 

The discussion in this section is not directed toward the dispersing elements 
between the collimator and camera. Such an element could be a low-order 
grating, grism, or echelle with prism or grating cross-disperser. The various 
options for a cross-dispersed echelle are discussed in Section 15.5. 

We also note that a field lens near the telescope focal surface may be part of 
any of the collimator-camera options discussed here. A field lens reimages the 
telescope exit pupil onto the dispersing element and may be required if the 
spectrometer is intended for long-slit or multiple-object spectroscopy. 


15.3.a. COLLIMATOR OPTIONS 


The choice of collimator is influenced in part by the focal ratio of the beam 
from the telescope. For slow beams, say f/15 at a Nasmyth focus, a mirror is 
often chosen as the collimator. Aberrations of such a mirror are generally 
negligible and the beam folding may help reduce the overall length of the 
spectrometer. For faster beams, say f/5 to f/7 at a Cassegrain focus, and fiber-
fed spectrometers, the trend is toward lens collimators. One exception to this is
the spherical collimator for Hectochelle, a multiobject echelle spectrometer for 
the f/5 Cassegrain focus of the converted Multiple Mirror Telescope. Details on 
lens collimators are found in the conference proceedings cited at the end of this 
chapter; we consider only mirror collimators in our discussion. 

The choice of a reflecting collimator for a fast spectrometer is generally a 
folded on-axis paraboloid or an off-axis paraboloid. The former is shown 
schematically in Fig. 14.5, the latter in Fig. 15.5. Either type has zero spherical 
aberration, but both have coma and astigmatism at off-axis points on a long slit. 
The off-axis aberrations are of no consequence for a stellar source, but do set a 
limit when extended sources are observed. In this latter case it is important to 
know the size of these aberrations as a function of position on the slit. 

Consider first an on-axis paraboloid without the fold mirror, as shown in Fig.
15.6. The chief ray at height y on the entrance slit comes from the center of the
telescope exit pupil at angle ψ with the z axis. The exit pupil is a distance f_1δ from
the slit, where f_1 is the telescope primary focal length and δ is given in Eq.
(2.6.1). The pupil is distance W = −(f_1δ + f_c) from the collimator, where W < 0
by the sign convention and f_c is the collimator focal length. From Eq. (2.6.4) we
get ψ = θ(m/δ), where θ is the angle on the sky.

Fig. 15.5. Off-axis paraboloidal collimator P with angle y between axes of telescope and mirror.
S, entrance slit of spectrometer at telescope focus.

Fig. 15.6. Chief ray from center of telescope exit pupil at angle ψ with telescope axis. S,
spectrometer entrance slit; C, on-axis collimator mirror.

Substituting these results in the coefficients in Table 5.6, with K = −1 and
n = 1, we find

$$B_1 = -\frac{\theta^2}{R}\,\frac{mD}{\delta d_1}, \qquad B_2 = -\frac{\theta}{R^2}\,\frac{D}{d_1}, \tag{15.3.1}$$

where R is the collimator radius of curvature, and d_1 and D are the diameters of
the collimator and telescope, respectively. In arriving at Eq. (15.3.1) we used the
fact that f_1δ/f_c is the ratio of the exit pupil diameter to the collimator diameter. As
expected, given our discussion in Section 5.5, the coma coefficient does not
depend on the pupil position.

The transverse aberrations are found by substituting Eqs. (15.3.1) into Eqs.
(5.5.9), with s′ replaced by the camera focal length f_2. The angular aberrations,
projected on the sky, are found by dividing each transverse aberration by the
system focal length f_s, where f_s = f(f_2/|f_c|) = F_2D. With these substitutions we
get

$$\mathrm{AAS} = \frac{\theta^2}{2F}\,\frac{m}{\delta}, \qquad \mathrm{ATC} = \frac{3\theta}{16F^2}, \tag{15.3.2}$$

where F is the focal ratio of the collimator or telescope.

As an example, we take parameters for a Cassegrain telescope in Table 6.10 
and evaluate Eqs. (15.3.2) at a field angle of 10 arc-min. The results are 0.28 and 
1.13 arc-sec for the collimator astigmatism and coma, respectively. From these 
results we see that coma sets the limit on allowable slit length; for typical seeing a
slit 20 arc-min long gives negligible loss in spatial resolution along the slit. 
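
The collimator example can be reproduced numerically. The sketch below reads
Eq. (15.3.2) as AAS = θ²m/(2Fδ) and ATC = 3θ/(16F²). The telescope parameters
are assumptions, since Table 6.10 is not reproduced here: we take m = 4,
k = 0.25, and F = 10, and we assume δ = m²k/(m + k − 1) for Eq. (2.6.1).
Function names are illustrative.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def pupil_delta(m, k):
    """Exit-pupil parameter delta (assumed form of Eq. 2.6.1)."""
    return m * m * k / (m + k - 1.0)


def collimator_blur_on_sky(theta_arcmin, F, m, k):
    """Return (astigmatism, coma) of an on-axis paraboloidal collimator, in arc-sec on the sky."""
    theta = theta_arcmin * 60.0 / ARCSEC_PER_RAD   # field angle in radians
    delta = pupil_delta(m, k)
    aas = theta ** 2 * m / (2.0 * F * delta)       # angular astigmatism, radians
    atc = 3.0 * theta / (16.0 * F ** 2)            # angular tangential coma, radians
    return aas * ARCSEC_PER_RAD, atc * ARCSEC_PER_RAD


aas, atc = collimator_blur_on_sky(theta_arcmin=10.0, F=10.0, m=4.0, k=0.25)
print(f"astigmatism = {aas:.3f} arc-sec, coma = {atc:.3f} arc-sec")
```

Under these assumptions the results agree with the 0.28 and 1.13 arc-sec quoted
above. Because coma grows linearly with θ and astigmatism quadratically, the
two become comparable only at field angles well beyond those of a practical slit.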

Analysis including the telescope aberrations shows that the astigmatism of a 
Ritchey-Chretien is opposite that of the collimator, and the net astigmatism is 
smaller than that from Eq. (15.3.2) by nearly a factor of ten. With a classical 
Cassegrain, collimator and telescope aberrations cancel one another and spatial 
resolution along the slit is determined entirely by seeing. 

An example of a collimator optically similar to that shown in Fig. 15.6 is the 
choice for the low resolution imaging spectrometer (LRIS) of the Keck telescope. 
In this case, however, only an off-axis portion of the full collimator shown in Fig. 
15.6 is used. In effect, the LRIS collimator is an off-axis paraboloid that reimages 
an off-axis portion of the telescope focal surface and whose axis coincides with 
that of the telescope. 

For the off-axis paraboloid shown in Fig. 15.5, y is the angle between the axes
of the telescope and paraboloid. The usual arrangement has the slit length
perpendicular to the plane defined by the telescope and paraboloid axes. For
this collimator we give results derived from ray traces. With y = 10° we find blur
diameters of approximately 0.5 and 1.5 arc-sec at θ = 1 and 3 arc-min, respec-
tively. For this example the total slit length is limited to about 4 arc-min.

Ray trace results for an off-axis paraboloid show that the blur diameter for
different y and θ is approximately proportional to the product of the angles. An
empirical relation for blur diameter is blur (arc-sec) = 0.05 y(deg) · θ(arc-min).
This relation is an approximate one, and ray trace results are required for more
exact measures of blur. The results given here also hold for a slit in the plane
defined by the axes of the telescope and paraboloid.
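
The empirical relation is easily turned into a rough estimate of usable slit
length. The sketch below is only as good as the relation itself. The 1 arc-sec
blur limit used in the example is a hypothetical tolerance chosen for
illustration, and the function names are not from the text.

```python
def offaxis_blur_arcsec(axis_angle_deg, theta_arcmin, coeff=0.05):
    """Empirical blur diameter (arc-sec) of an off-axis paraboloidal collimator."""
    return coeff * axis_angle_deg * theta_arcmin


def max_slit_length_arcmin(axis_angle_deg, blur_limit_arcsec, coeff=0.05):
    """Total slit length (both sides of slit center) that keeps blur below the limit."""
    half_length = blur_limit_arcsec / (coeff * axis_angle_deg)
    return 2.0 * half_length


for theta in (1.0, 3.0):
    print(f"theta = {theta:.0f} arc-min: blur = {offaxis_blur_arcsec(10.0, theta):.1f} arc-sec")

# Hypothetical tolerance: blur no larger than 1 arc-sec at the slit ends.
print(f"slit length = {max_slit_length_arcmin(10.0, 1.0):.0f} arc-min")
```

With a 10° angle between the axes, the assumed 1 arc-sec tolerance gives a
total slit length of 4 arc-min, in line with the limit stated above.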

In summary, for slit spectroscopy of extended objects, the on-axis paraboloid 
is the better choice for the collimator, if a slit more than a few arc-min in length or 
a fiber-sampled field more than a few arc-min on a side is required. This is 
achieved at the expense of an additional optical element, the folding flat. The 
echelle spectrograph on the 4-m Mayall telescope at Kitt Peak has an on-axis 
paraboloid collimator; the RC spectrograph on the same telescope has an off-axis 
paraboloid. 


15.3.b. CAMERA OPTIONS


As pointed out numerous times, spectrometer cameras almost always are fast,
with monochromatic focal ratios of two or even less. Capturing all of the
dispersed light often requires a camera whose focal length/clear aperture ratio
is one or less. The options for such fast systems include Schmidt and catadioptric
cameras, as discussed in Chapters 7 and 8, and all-refractive lens systems. For
spectrometers on telescopes in the 8-m class, the trend is toward lens systems. We
do not attempt to discuss the myriad lens systems that have been built, but suggest
the reader consult the conference proceedings cited at the end of the chapter.

Design considerations for a Schmidt-type camera are covered in Chapter 7. As 
noted there, requirements on cameras for direct imaging are different in some 
respects from those for spectrometer cameras. For example, chromatic focal shifts 
in solid and semisolid cameras, intolerable in direct imaging, are accommodated 
in spectrometer cameras by tilting the detector to match the focal surface. 

Spectrometers also require a wider camera in the direction of primary 
dispersion to accommodate beam expansion due to anamorphic magnification 
and to avoid vignetting of the dispersed beams. This is true for both gratings used 
in a single order and cross-dispersed echelles. Unlike the case of a direct camera, 
the pupil in a spectrometer is usually displaced from the corrector, and this means 
a somewhat larger corrector and camera mirror to cover the same field. A way to 
reduce the size of the camera optics in a cross-dispersed echelle is to reimage the 
dispersed light from the echelle onto the cross-dispersing element. This so-called 
white pupil design is discussed in Section 15.5. 

A final significant difference between the modes of operation is the location of 
the focal surface. It is usually internal in a large Schmidt telescope, but almost 
always external in a spectrometer camera. A schematic of a folded Schmidt 
camera is shown in Fig. 15.7, with the fold mirror roughly midway between the 
corrector and spherical mirror. 

A study of the layout in Fig. 15.7 shows that careful placement of the fold
mirror is required to ensure an efficient camera. It is essential that the detector not
see collimated light, and this requires that the detector be in the shadow of the
fold mirror, as seen from the corrector. If the distance from the fold mirror vertex
to detector is large, the size of the hole in the fold mirror must also be large to
avoid vignetting the beam from the spherical mirror. But the hole in the fold
mirror should be as small as possible to minimize vignetting of the beam from the
corrector. The tradeoff between these competing requirements sets the design of a
folded Schmidt.

Fig. 15.7. Folded Schmidt camera. C, corrector plate; F, folding flat; M, spherical mirror.

For catadioptric systems of the type discussed in Chapter 8, the size of the
central obscuration is often a serious problem. For these systems the advantage of
an external focus must be weighed against the disadvantage of vignetting by the
central obscuration.

Lens systems have the advantages of both an external focus and no central 
obscuration, and have become the choice for most large spectrometers. Although
these systems often have many separate lens elements to achieve the necessary
image quality over a wide spectral range, high-efficiency antireflection coatings 
make them nearly as efficient as simpler reflecting cameras. 


15.4. FIBER-FED SPECTROMETERS 


Plane grating instruments have traditionally been used to observe stellar or 
near-stellar sources one at a time with the object centered on the entrance slit. For 
an observing program requiring spectra of many objects of comparable brightness 
in close proximity on the sky, the observing time can be reduced significantly by 
recording the spectra of many sources in the same exposure. As pointed out in 
Section 12.3, this is accomplished by using flexible glass fibers to transfer the 
light from separate sources in the telescope focal plane to the spectrometer slit. 
Each fiber is positioned at a source on one end and aligned along the slit at the 
other end. This technique of multiobject spectroscopy is applicable to objects 
within a cluster of stars or galaxies, as demonstrated by a number of observers. 

As an illustration of the thinking that goes into designing a fiber-fed 
instrument, we take a telescope designed expressly for multiobject spectroscopy 
and show the parameters coming out of a design study. 


15.4.a. A DESIGN EXERCISE 


The telescope and configuration we choose for our design exercise is the 6.5-m 
converted MMT of the Smithsonian Astrophysical Observatory set up as an f/5 
Cassegrain. This focus has a 1° diameter field suitable for fiber-fed spectroscopy 
with one of the instruments, Hectospec, designed for moderate resolving power in 
the range 1000-5000. Some of its optical parameters, as given by Fabricant et al. 
(1994), are provided in Table 15.3. 
Table 15.3

Optical Parameters of Hectospec*

  Fibers                       300; 250 μm core diameter, 0.5 mm spacing
  Collimator diameter d_1      260 mm
  Camera focal length f_2      400 mm
  Camera-collimator angle      35°
  Pixel size                   13.5 μm
  f_2/f_1                      1/3.25
  r f_2/f_1                    1/3.45

* Hectospec is a moderate-resolution fiber-fed spectrometer on the modified
6.5-m MMT of the Smithsonian Astrophysical Observatory.


Of the several grating configurations of Hectospec, we choose the one with a
270 groove/mm grating and a blaze angle δ = 5.3°. To simplify our exercise we
choose Eq. (15.2.2) rather than the more accurate Eq. (15.2.1); for this grating the
error made by this simplification is only a few percent. Substituting d_1 and Δ
from Table 15.3 into Eqs. (15.2.2) and (15.2.4) we get

$$\mathcal{R}\phi = 0.0074 = 1500\ \text{arc-sec}, \qquad \phi F_2 = 0.86\ \text{arc-sec}. \tag{15.4.1}$$

With a telescope scale of approximately 160 μm/arc-sec, each fiber subtends an
angle of about 1.5 arc-sec on the sky. Putting this value into Eq. (15.4.1) gives
ℛ = 1000 and F_2 = 0.57. This value of F_2 for an optimum pixel match is
obviously not feasible; the choice made by the system designers is F_2 = 1.5.
Hence each projected fiber end spans about 5.5 pixels.
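
The arithmetic of this exercise is collected in the sketch below. It assumes
the Littrow forms ℛφ = 2 tan δ (d_1/D) and φF_2 = 2Δ/D for Eqs. (15.2.2) and
(15.2.4), which are inferred from the numbers in Eq. (15.4.1). It also uses the
fiber angle rounded to 1.5 arc-sec, as in the text. The function and key names
are illustrative.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def hectospec_exercise(phi_arcsec=1.5):
    """Reproduce the Hectospec design numbers from the Table 15.3 parameters."""
    D_m = 6.5                    # telescope diameter
    d1_m = 0.260                 # collimator diameter
    f2_m = 0.400                 # camera focal length
    pixel_um = 13.5              # pixel size
    fiber_core_um = 250.0        # fiber core diameter
    r_f2_over_f1 = 1.0 / 3.45    # magnification from fiber end to detector
    delta = math.radians(5.3)    # grating blaze angle

    r_phi_rad = 2.0 * math.tan(delta) * d1_m / D_m                 # assumed Eq. (15.2.2)
    phi_f2_arcsec = 2.0 * pixel_um * 1e-6 / D_m * ARCSEC_PER_RAD   # assumed Eq. (15.2.4)
    return {
        "r_phi_rad": r_phi_rad,
        "r_phi_arcsec": r_phi_rad * ARCSEC_PER_RAD,
        "phi_f2_arcsec": phi_f2_arcsec,
        "resolving_power": r_phi_rad * ARCSEC_PER_RAD / phi_arcsec,
        "f2_ratio_optimum": phi_f2_arcsec / phi_arcsec,
        "f2_ratio_actual": f2_m / d1_m,
        "pixels_per_fiber": fiber_core_um * r_f2_over_f1 / pixel_um,
    }


for name, value in hectospec_exercise().items():
    print(f"{name:>18s} = {value:.4g}")
```

The computed pixel span of the projected fiber comes out slightly below the
"about 5.5" in the text, and the actual camera focal ratio f_2/d_1 is close to
the designers' choice of 1.5.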

This exercise indicates some of the tradeoffs made by the system designers for 
this instrument. The fiber size chosen is a tradeoff between ease of positioning 
and maximum light from a distant galaxy, better with a large fiber, and resolving 
power, larger with a small fiber. There is also the obvious tradeoff between pixel 
matching and camera focal ratio, a choice faced by nearly all designers of 
spectrometers for large telescopes. 


15.5. ECHELLE SPECTROMETERS 


For high-resolution astronomical spectroscopy, ℛ ≈ 4E4 or larger, an echelle
is the choice over a grating used in low order. The principal reasons for this are 
larger luminosity, discussed in Section 13.3, and the 2D format of the spectrum, 
which permits broad spectral coverage on efficient detectors. Because the echelle 
has a large groove spacing it is used at high-order numbers, as the example to 
follow illustrates. Thus it is necessary to provide cross-dispersion to separate the
orders, or use a filter to isolate a single order. 

In this section we discuss the form of the 2D format with different cross- 
dispersers and possible locations of a cross-disperser within an echelle spectro- 
meter. We give a design exercise to illustrate the choice of parameters appropriate 
for an echelle instrument on a 4-m telescope. Finally, we point out some of the 
different collimator and camera possibilities selected for echelle spectrometers on 
large telescopes. 


15.5.a. SPECTRUM FORMATS 


If wavelengths in different orders are to be recorded without confusion, a 
cross-disperser must be put in series with the echelle. A cross-disperser is simply 
another element, usually a prism or another grating, whose dispersion is at right 
angles to that of the echelle and whose function is to separate the orders. The 
angular dispersion of the order separator is usually many times smaller, and the 
combination of elements gives a 2D spectrum format. The format outline is set by 
the relative dispersions of the two elements; in this section we discuss the factors 
that determine a spectrum format. 

We consider first the factors that determine the length of the spectrum in a
given echelle order. With a camera of focal length f_2, the spectrum in the focal
plane has length f_2Δβ, where Δβ is the angular length of one free spectral range
Δλ. Combining Eqs. (13.2.2a) and (13.2.6) we get

$$\Delta\beta \approx \frac{\lambda_b}{\sigma\cos\beta_b}, \tag{15.5.1}$$

where λ_b is the blaze wavelength in the mth order and β_b is its angle of
diffraction. This relation is not exact but for m > 10 is a good approximation
to the exact angular width. The free spectral range within this Δβ is

$$\Delta\lambda = \frac{\lambda_b}{m} = \frac{\lambda_b^2}{2\sigma\sin\delta\cos\theta}. \tag{15.5.2}$$

As noted in Section 13.3, Δλ is the spectral range between the approximate half-
intensity points of the blaze function; it is also the separation between blaze
wavelengths in adjacent orders. For the echelle example to follow, Δβ = 4.33°
and Δλ = 11.1 nm with θ = 5° in order m = 45.
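
The angular order length and free spectral range follow directly from the
blaze-peak relations mλ_b = 2σ sin δ cos θ and β_b = δ − θ. In the sketch below
the quoted values are reproduced for an echelle with 79 grooves/mm and
tan δ = 2. That groove density is an assumption of this sketch, chosen because
it matches the quoted numbers; it is not stated for this particular example.

```python
import math


def echelle_order(grooves_per_mm, tan_delta, theta_deg, m):
    """Blaze wavelength (nm), angular order length (deg), and free spectral range (nm)."""
    sigma_nm = 1.0e6 / grooves_per_mm           # groove spacing in nm
    delta = math.atan(tan_delta)                # blaze angle
    theta = math.radians(theta_deg)
    lam_b = 2.0 * sigma_nm * math.sin(delta) * math.cos(theta) / m
    beta_b = delta - theta                      # diffraction angle at blaze peak
    dbeta = lam_b / (sigma_nm * math.cos(beta_b))   # Eq. (15.5.1), radians
    dlam = lam_b / m                                # Eq. (15.5.2)
    return lam_b, math.degrees(dbeta), dlam


lam_b, dbeta_deg, dlam = echelle_order(79.0, 2.0, 5.0, 45)
print(f"lambda_b = {lam_b:.1f} nm, d_beta = {dbeta_deg:.2f} deg, d_lambda = {dlam:.1f} nm")
```

For comparison, the same function applied to a 31.6 groove/mm echelle places
the blaze peak of order 45 near 1.25 μm, well outside the visible range.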

From Eqs. (15.5.1) and (15.5.2) it is evident that σ is the controlling parameter
for a given blaze angle and wavelength. Recall also that the parameters that set
the resolving power are the blaze angle and diameters of the collimator and
telescope. Because the camera design depends on both beam diameter and focal
length, it is clear that the order length f_2Δβ also depends on parameters other than
the groove spacing.

Table 15.4

Equations for Echelle at Blaze Peak

  mλ_b = σ(sin β_b + sin α) = 2σ sin δ cos θ
  β_b = δ − θ,   α = δ + θ

For convenient reference, we give the important relations for an echelle at the 
blaze peak in Table 15.4, including those for order length and free spectral range. 
Note that angular dispersion and resolving power, as given in general in Table 
13.1, are not constant over a single order. Their values at the blaze peak in Table 
15.4 are essentially an average over each order. 

Assuming a cross-disperser with angular dispersion A_c, the separation Δy
between adjacent orders is given by Δy = f_2A_cΔλ. If A and A_c are assumed
constant over a free spectral range, that echelle order is tilted by an angle ψ with
respect to the direction of echelle dispersion, where tan ψ = A_c/A. This tilt of
orders is shown schematically in Fig. 15.8.

Fig. 15.8. Tilt of echelle orders relative to directions of echelle dispersion (E) and cross dispersion (C).
We consider first a grating cross-disperser with A_c = m_c/(σ_c cos β_c) from Eq.
(13.3.2a). Although A_c is essentially constant over a typical echelle order, A for
the echelle increases as β increases, hence it is not constant along an order.
Typically tan ψ changes by a few percent from one end to the other and the order
is slightly curved. At the blaze peak of the echelle the tilt of the echelle orders
with a cross-dispersing grating is given by

$$\tan\psi = \frac{m_c}{\sigma_c\cos\beta_c}\,\frac{\lambda_b\cos\beta_b}{2\sin\delta\cos\theta} = \text{constant}\cdot\lambda_b. \tag{15.5.3}$$

Given the angular dispersion A_c of the grating, we can write the order separation
Δy in terms of the blaze wavelength. The result is

$$\Delta y\,(\text{grating}) = f_2\,\frac{m_c}{\sigma_c\cos\beta_c}\,\frac{\lambda_b^2}{2\sigma\sin\delta\cos\theta} = C\lambda_b^2, \tag{15.5.4}$$

where C is constant for a given echelle-grating combination.

For a prism cross-disperser, A_c is approximately proportional to λ^{-3}, as shown
in Section 3.2. The combination of changing dispersions for both the prism and
echelle along a given order results in more significant curvature for each order. In
the case of a prism cross-disperser the tilt of an order and the separation between
orders, both at the blaze peak, are given by

$$\tan\psi \approx \text{constant}\cdot\lambda_b^{-2}, \tag{15.5.5}$$

$$\Delta y\,(\text{prism}) \approx C'\lambda_b^{-1}. \tag{15.5.6}$$

Comparing Eqs. (15.5.4) and (15.5.6) we see that gratings give larger order
separation at longer wavelengths, while the reverse is true for prisms. When
detector area is limited, a prism cross-disperser makes better use of the available
area because Δy changes less rapidly with wavelength.
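
The contrast between the two cross-dispersers can be made concrete by comparing
how the order separation scales across a format. The sketch below uses only the
power laws Δy ∝ λ_b² for a grating and Δy ∝ 1/λ_b for a prism. The 400 and
750 nm limits are those of the format example in this section, and the function
name is illustrative.

```python
def separation_ratio(lam_ref_nm, lam_nm, exponent):
    """Ratio of order separations at lam_nm and lam_ref_nm when the separation scales as lambda**exponent."""
    return (lam_nm / lam_ref_nm) ** exponent


grating = separation_ratio(400.0, 750.0, 2)    # grating: separation grows as lambda^2
prism = separation_ratio(400.0, 750.0, -1)     # prism: separation falls as 1/lambda

print(f"grating: red/blue order separation = {grating:.2f}")
print(f"prism:   red/blue order separation = {prism:.2f}")
```

Between 400 and 750 nm the grating separation changes by a factor of about 3.5,
the prism separation by less than a factor of 2. If the minimum separation is
set by the slit length, the prism format therefore wastes less detector area.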

To illustrate the formats given by different cross-dispersers, we take a 
31.6 groove/mm echelle with tan δ = 2 and θ = 5°. For a grating cross-disperser,
we assume one with 158 grooves/mm used in first order; for a prism cross- 
disperser we assume two 45° UBK7 prisms in series. The formats, to the same 
scale, are shown in Figs. 15.9 and 15.10 for the grating and prism, respectively. 
Each pattern has 66 orders over a wavelength range from 400 to 750nm, with 
m = 141 and 76 at these wavelengths, respectively. 
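
The order numbers quoted for this format follow from the blaze-peak relation
mλ_b = 2σ sin δ cos θ. In the sketch below, rounding m up at each wavelength
limit is a choice made here because it reproduces the quoted orders; the text
does not say how the limits were assigned. The function name is illustrative.

```python
import math


def blaze_orders(grooves_per_mm, tan_delta, theta_deg, lam_min_nm, lam_max_nm):
    """Return (highest order, lowest order, number of orders) for a wavelength range."""
    sigma_nm = 1.0e6 / grooves_per_mm
    delta = math.atan(tan_delta)
    theta = math.radians(theta_deg)
    m_lambda_nm = 2.0 * sigma_nm * math.sin(delta) * math.cos(theta)  # m * lambda_b
    m_high = math.ceil(m_lambda_nm / lam_min_nm)   # order at the blue limit
    m_low = math.ceil(m_lambda_nm / lam_max_nm)    # order at the red limit
    return m_high, m_low, m_high - m_low + 1


print(blaze_orders(31.6, 2.0, 5.0, 400.0, 750.0))
```

For the 31.6 groove/mm echelle this gives orders 141 through 76, or 66 orders
in all, as stated above.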

The linear size of the formats shown in Figs. 15.9 and 15.10 is proportional to 
the camera focal length, as is the linear separation between orders. The angular 
separation between orders, projected on the sky, is proportional to the beam 
diameter and is independent of the camera focal length. We give numerical values 
for these formats in our discussion of a particular echelle spectrometer design to 
follow. 
Fig. 15.9. Echelle format with grating cross-disperser. See text, Section 15.5.a, for parameter
values of echelle and grating.

Fig. 15.10. Echelle format with prism cross-disperser. See text, Section 15.5.a, for parameter
values of echelle and prism train.


15.5.b. CROSS-DISPERSION MODES 


Several different methods of order separation are possible. The simplest way to 
suppress unwanted orders is with a narrowband filter between the telescope and 
spectrometer. Cross-dispersion can also be done with a separate prism or grating 
spectrometer following an echelle instrument, but such a system is less efficient 
and more prone to misalignment than an echelle spectrometer with internal cross-
dispersion. We choose to limit our discussion to internal cross-dispersion.

For internal cross-dispersion modes, the three principal options are a prism (or 
prism train) or plane grating located: 


(1) between the echelle and camera optics and used single-pass; 
(2) between the collimator and echelle and used single-pass; or 
(3) close to the echelle and used double-pass. 


It is possible to devise an echelle instrument with either a concave grating used 
as both camera and cross-disperser or a concave grating predisperser also 
doubling as the collimator. Configurations with concave gratings, however, are 
not practical in the fast systems required for large telescopes. 

In addition to the possible locations of the cross-disperser noted in the 
preceding text, the choice of angles of the chief ray to and from the echelle is 
an important part in the design of an echelle spectrometer. Although we make 
reference to angles in this section, we defer the main discussion of this topic to 
the following section. 

We first consider mode (1), in which the disperser follows the echelle, oriented 
with α > β, as shown schematically in Fig. 15.11. The direction of the echelle
dispersion is in the plane of Fig. 15.11, and the prism or grating must be clear of 
the collimator beam and large enough to accept the dispersed light in each echelle 
order. In the direction perpendicular to the echelle dispersion, the width of the 
cross-disperser is the diameter of the collimator beam. 

It is evident from Fig. 15.11 that the cross-disperser can be placed closer to the
echelle if the angle θ is larger, hence the height of the cross-disperser is less.
Larger θ, however, also means a larger dispersed beam height because of
anamorphic magnification. The latter effect largely cancels the reduction in
size obtained by putting the cross-disperser closer to the echelle; the net effect
is that the size of the cross-disperser depends only weakly on the choice of θ,
except for θ near zero.

Fig. 15.11. Echelle (E) with postdisperser (CD), denoted as mode (1) in text, in in-plane
configuration.

We pointed out in Section 13.4 that the efficiency at the blaze peak and the
average efficiency over each order decrease as θ increases. In order to keep the
efficiency as high as possible, θ is chosen as small as possible, within the
constraint of having the dispersed beam clear the collimator beam. Given these
competing effects, it turns out that θ in the range 4–6° is a good compromise
between beam clearance and efficiency for this arrangement.

A variant of mode (1) is one in which α = β, hence θ = 0, but γ ≠ 0, as shown
in Fig. 15.12. We discuss the important features of this choice of angles in a
following section.

The option listed here as mode (2) has the cross-disperser located in the
collimator beam and used single-pass, as shown schematically in Fig. 15.13. In
this mode the required size of the disperser is simply that of the collimator beam
and the discussion for mode (1) relative to anamorphic magnification applies to the
size of the camera optics. An example of an echelle spectrometer with this mode
of cross-dispersion is the University College London Echelle Spectrograph
(UCLES) for the Anglo-Australian Telescope (AAT). The disperser in this
instrument is a train of three fused silica prisms.

Fig. 15.12. Echelle (E) with postdisperser (CD) in quasi-Littrow or off-plane configuration.

Fig. 15.13. Echelle (E) with predisperser (CD) in collimator beam.

The final cross-disperser option is mode (3), a prism(s) or transmission grating 
close to the echelle and used double-pass. Of the two choices of disperser for this 
mode, the prism is the only viable choice because of its constant efficiency. The 
efficiency curve for a transmission grating is sharply peaked if double-passed and 
this option cannot compete with a prism. An obvious advantage of a double-pass
prism(s) is larger order separation compared with a single-pass prism as used in
mode (1). An example of an echelle spectrometer with this type of cross-disperser
is an instrument for the 2.7-m telescope at the McDonald Observatory, Texas. 

A double-pass prism arrangement is an attractive option for quasi-Littrow 
configurations in which α = β = δ. As pointed out in the discussion of Table 13.4
in Section 13.4, echelles approaching R-4 are almost forced into the quasi-Littrow 
mode because of problems with echelle size and anamorphic magnification. 

Of the three modes discussed in this section the one most often chosen is mode 
(1). This mode has the advantage of allowing either gratings or prisms to be used 
and is not subject to tilt of the projected slit on the detector. We discuss this latter 
effect in Section 15.5.d. 


15.5.c. GRATINGS VERSUS PRISMS AS CROSS-DISPERSERS 


The choice of a grating versus a prism depends on many factors including 
required order separation, spectral range coverage, transmission efficiency, 
efficient use of detector area, and cost. Gratings usually have higher dispersion 
and, if a camera with a short focal length is used, may be the only practical option. 
One disadvantage of a grating is its changing efficiency over a wide spectral 
range, as shown in Fig. 13.10, and thus several may be needed. Another 
disadvantage of a grating is the less than optimum use of detector area, as 
shown by Eq. (15.5.4). The main advantage of gratings is a wide selection of 
available groove spacings to give a range of cross-dispersions. 

A prism has high and constant efficiency over its range of transmittance, but 
lower dispersion than a grating, hence more than one prism may be required to get 
the necessary order separation in a spectrometer with a fast camera. Large prisms 
are generally more expensive than gratings of comparable size, and the choice of 
transparent glasses for near ultraviolet wavelengths, especially to the atmospheric 
cutoff, is limited. A significant advantage of a prism over a grating is that it gives 
a format that makes better use of detector area. 

It is interesting to note the history of the choice of cross-disperser. Gratings 
were the choice for the first echelle spectrometers built in the 1960s and 1970s. 
These instruments typically had beam diameters of 100mm or less, and used 
image tubes or photographic plates. Almost without exception, echelle instruments
designed and built in the 1980s used prism cross-dispersion. These systems
typically had beam diameters of approximately 200 mm and used CCD detectors.
For the largest echelle spectrometer in operation at the time of this writing, 
HIRES at the Keck 10-m telescope, the beam diameter is 300 mm and a grating 
cross-disperser is used. For this instrument both the grating and echelle are 
mosaics, as the size of single prisms required for this beam size precluded their 
use. Walker et al. (1994) have suggested prism mosaic configurations as cross- 
dispersers for beams of this size. 


15.5.d. ECHELLE SPECTROMETER CONFIGURATIONS 


Echelle spectrometers have been built in many different configurations, but
each is a variant of one of two choices of angles of the chief ray relative to the
echelle. One choice is the so-called in-plane design in which γ = 0. In this design
the collimator and camera beams are separated by choosing α > β. The other
choice is the off-plane design with γ ≠ 0. Choosing α = β gives a quasi-Littrow
design, hence no anamorphic magnification in the direction of echelle dispersion.
Figures 15.11 and 15.12 show in-plane and off-plane schematic layouts, respec-
tively. A summary of the characteristics and modes for selected echelle spectro-
meters is given in Table 15.5.


Table 15.5

Characteristics and Modes of Selected Echelle Spectrometers

                   R-value   W (mm)   Mode     Cross-Disperser
In-Plane Designs
  HIRES^a           2.8      1260     1        Gratings
  Hectochelle^b     2.1       840     filter   -
  HROS^c            2.0       410     2        Prisms (immersed echelle)
  CARCES^d          2.0       410     1        Prisms
Off-Plane Designs
  HDS^e             2.8       840     1        Gratings
  HRS^f             3.8       840     1        Gratings (white-pupil)

^a High Resolution Echelle Spectrograph, Keck 10-m Telescope.
^b Echelle Spectrograph, Modified 6.5-m Multiple Mirror Telescope.
^c High-Resolution Optical Spectrograph, Gemini 8-m Telescope.
^d Chicago ARC Echelle Spectrograph, Apache Point 3.5-m Telescope.
^e High Dispersion Spectrograph, Subaru 8.2-m Telescope.
^f High Resolution Spectrograph, Hobby-Eberly 9-m Telescope.

A consequence of nonzero γ in the off-plane design is that the entrance slit is
reimaged by the spectrometer optics with a tilt, as discussed in Section 14.1, with
the tilt proportional to γ and the slope given by Eq. (14.1.9). Substituting A from
Table 15.4 into Eq. (14.1.9), the slope or tilt of the reimaged slit is, to a good
approximation, dβ/dγ = 2 tan γ tan δ. This tilt is of little consequence for a single
point source at the entrance slit, but must be taken into account in data reduction
for a long slit.
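
To give a feeling for the size of the slit tilt, the sketch below evaluates the
slope 2 tan γ tan δ for the R-values of the off-plane designs in Table 15.5.
The off-plane angle γ = 1° is a hypothetical value chosen for illustration, not
a parameter of either instrument. Converting the slope to an angle with an
arctangent assumes equal image scales along and across the dispersion.

```python
import math


def slit_tilt_slope(gamma_deg, tan_delta):
    """Slope d(beta)/d(gamma) of the reimaged slit in a quasi-Littrow echelle."""
    return 2.0 * math.tan(math.radians(gamma_deg)) * tan_delta


for r_value in (2.8, 3.8):
    slope = slit_tilt_slope(gamma_deg=1.0, tan_delta=r_value)   # gamma is hypothetical
    tilt_deg = math.degrees(math.atan(slope))
    print(f"R-{r_value}: slope = {slope:.3f}, tilt = {tilt_deg:.1f} deg")
```

Even a small off-plane angle gives a tilt of several degrees with a steep
echelle, which is why the effect cannot be ignored for long-slit data.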

The choices for collimator and camera optics are about as varied as the number 
of operating echelle spectrometers; examples of some of these choices are listed 
in Table 15.6. Details on the designs of these spectrometers, and versions based 
on a modified Czerny-Turner arrangement, are found in the references at the end 
of the chapter. 

One optical layout of echelle spectrometers not yet discussed is the so-called 
white-pupil design. In this design additional optics between the echelle and the 
cross-disperser in mode (1) reimage the echelle onto the cross-disperser with unit 
magnification. Thus the beam size at the prism or grating is the same as that of a 
monochromatic beam emerging from the echelle and the extra cross-disperser 
size needed to capture the dispersed light in each echelle order in an in-plane 
design is not required. Examples of designs that incorporate the white-pupil 
concept are the UV-Visual Echelle Spectrograph (UVES) for the ESO Very Large 
Telescope (VLT) and the fiber-fed High Resolution Spectrograph (HRS) for the 
Hobby-Eberly Telescope designed by Tull (1994). Both of these instruments use 
an R-4 echelle to get the largest possible ℛφ product from existing echelles. A
schematic layout of a white-pupil design is shown in Fig. 15.14. 

A final approach to increasing the ℛφ product is to use an immersed echelle,
as discussed in Section 13.3. This is the choice for the high-resolution optical 
spectrograph (HROS) for a Gemini 8-m Telescope. 


Table 15.6

Optics of Selected Echelle Spectrometers^a

                Focus              Collimator            Camera
  HIRES         Nasmyth f/14       Tilted Sphere         Catadioptric
  Hectochelle   Bench^b            Off-axis Paraboloid   Catadioptric
  HROS          Cassegrain f/16    Paraboloid            Catadioptric
  CARCES        Nasmyth f/10       Paraboloid            Schmidt
  HDS           Nasmyth f/13       Paraboloid            Catadioptric
  HRS           Bench^b            Off-axis Paraboloid   All-Refractive

^a See Table 15.5 for definitions of acronyms.
^b Bench spectrometers are fiber-fed.

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M1 M2 


WP 


Fig. 15.14. Schematic layout of intermediate optics for white-pupil spectrometer design. 
Dispersed light from the echelle (E) is recollimated by the M1, M2 mirror pair and the pupil at the 
echelle is reimaged at the white pupil (WP). 


15.5.e. ECHELLE DESIGN EXERCISE 


To illustrate some of the considerations that go into the design of an echelle
spectrometer, we take some specific parameters and determine the characteristics
of the instrument and spectrum format. The assumed parameters include
a telescope diameter $D = 4$ m and an R-2 ($\delta = 63.5^\circ$) echelle with 31.6
grooves/mm used at $\theta = 5^\circ$. We aim for a system capable of spectral resolving
power $\mathcal{R} = 5 \times 10^4$ when the slit width $\phi = 1$ arc-sec and the projected slit width
$w' = 30\ \mu$m.

Defining $\Gamma = 2 \sin\delta \cos\theta / \cos\alpha$, we can rewrite $\mathcal{R}$ from Table 15.3 and Eq.
(15.2.3) in suitable units as

$$\mathcal{R}\phi\ (\text{arc-sec}) = 200\,\Gamma\,\frac{d_1(\text{mm})}{D(\text{m})}, \tag{15.5.7}$$

$$w'(\mu\text{m}) = 5\,r\,\phi\,(\text{arc-sec})\,D(\text{m})\,F_2, \tag{15.5.8}$$

where $\Gamma = 4.87$ and $r = 0.70$ for the echelle angles chosen. Solving for $d_1$ and
$F_2$ gives $d_1 = 205$ mm and $F_2 = 2.14$, hence the camera focal length
$f_2 = 440$ mm. From these results we see that the only possible instrument
configuration is one with a fast camera, such as a folded Schmidt. The projected
slit width is covered by two pixels if the pixel size $\Delta = 15\ \mu$m.
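The numbers quoted above can be reproduced with a short sketch of Eqs. (15.5.7) and (15.5.8). It assumes an in-plane mounting, so that $\alpha = \delta + \theta$ and $\beta = \delta - \theta$, and takes $r = \cos\alpha/\cos\beta$; the function name and argument names are illustrative only.

```python
import math

def echelle_first_order(D_m, delta_deg, theta_deg, R, phi_arcsec, w_um):
    """Return (Gamma, r, d1_mm, F2, f2_mm) from Eqs. (15.5.7) and (15.5.8)."""
    delta = math.radians(delta_deg)
    theta = math.radians(theta_deg)
    # Assumption: in-plane mounting, alpha = delta + theta, beta = delta - theta.
    alpha = delta + theta
    beta = delta - theta
    gamma_factor = 2.0 * math.sin(delta) * math.cos(theta) / math.cos(alpha)
    r = math.cos(alpha) / math.cos(beta)  # assumed form of the factor r
    d1_mm = R * phi_arcsec * D_m / (200.0 * gamma_factor)  # Eq. (15.5.7) solved for d1
    F2 = w_um / (5.0 * r * phi_arcsec * D_m)               # Eq. (15.5.8) solved for F2
    f2_mm = F2 * d1_mm                                     # camera focal length
    return gamma_factor, r, d1_mm, F2, f2_mm

Gamma, r, d1_mm, F2, f2_mm = echelle_first_order(4.0, 63.5, 5.0, 5e4, 1.0, 30.0)
print(f"Gamma={Gamma:.2f} r={r:.2f} d1={d1_mm:.0f} mm F2={F2:.2f} f2={f2_mm:.0f} mm")
```

With the parameters of the exercise this gives values that round to the $\Gamma$, $r$, $d_1$, $F_2$, and $f_2$ quoted in the text.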

Note that the results derived so far are independent of the echelle groove
spacing; this parameter is used to find the length $f_2\,\Delta\beta$ of each echelle order. With
our chosen echelle, the order lengths are 10.7 and 19.8 mm, respectively, at 400
and 750 nm, with corresponding plate factors of 2.67 and 5.0 Å/mm.
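As a rough check of these order lengths and plate factors, the sketch below takes the angular length of one free spectral range as $\Delta\beta = \lambda/(\sigma\cos\beta)$, with $\beta = \delta - \theta$ and the order number $m$ treated as continuous. These forms are assumptions made for illustration, not relations quoted from the tables cited above.

```python
import math

def order_length_and_plate_factor(lam_nm, grooves_per_mm, delta_deg, theta_deg, f2_mm):
    """Return (order length in mm, plate factor in Angstrom/mm) for one echelle order."""
    delta = math.radians(delta_deg)
    theta = math.radians(theta_deg)
    beta = delta - theta                      # assumed in-plane diffraction angle
    sigma_nm = 1.0e6 / grooves_per_mm         # groove spacing
    m = 2.0 * sigma_nm * math.sin(delta) * math.cos(theta) / lam_nm  # continuous order number
    d_beta = lam_nm / (sigma_nm * math.cos(beta))  # angular length of one free spectral range
    length_mm = f2_mm * d_beta
    fsr_nm = lam_nm / m                       # free spectral range
    plate_A_per_mm = 10.0 * fsr_nm / length_mm
    return length_mm, plate_A_per_mm

len400, P400 = order_length_and_plate_factor(400.0, 31.6, 63.5, 5.0, 440.0)
len750, P750 = order_length_and_plate_factor(750.0, 31.6, 63.5, 5.0, 440.0)
print(f"400 nm: {len400:.1f} mm, {P400:.2f} A/mm; 750 nm: {len750:.1f} mm, {P750:.2f} A/mm")
```

The results agree with the quoted values to within about one percent; the small differences presumably reflect rounding or non-paraxial details in the original calculation.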

The order separation and format width are determined by the cross-disperser, so
far unspecified. If a first-order grating with 158 grooves/mm is used, the orders at
400 and 750 nm are separated by 23.3 mm; if two 45° UBK7 prisms are used, the
orders at these wavelengths are separated by 15.7 mm. The echelle and cross-disperser
parameters used in this example are the same as those used for the
formats shown in Figs. 15.9 and 15.10.

The spacings of the orders within the formats depend on the wavelength 
according to Eqs. (15.5.4) and (15.5.6). For the grating the angular separations 
between adjacent orders, projected on the sky, are 4.7 and 16.1 arc-sec at 400 and 
750 nm, respectively. The corresponding separations for the prism cross-disperser 
are 7.6 and 4.5 arc-sec, with smaller separation at the long wavelength end of the 
format. 

Our calculated parameters for this example assumed a 4-m telescope, $\mathcal{R} =
50{,}000$, and $\phi = 1$ arc-sec. Suppose we want a spectrometer with the same $\mathcal{R}$ and
$\phi$ on an 8-m telescope. For the same $\Gamma$ we find $d_1 = 410$ mm from Eq. (15.5.7).
Given the largest echelle width of 305 mm now available, the only viable way to
reach the desired resolving power is an echelle with a larger $\tan\delta$. If, for example,
we choose a quasi-Littrow configuration with an R-4 echelle, then we find $\Gamma = 2
\tan\delta = 8$ and $d_1 = 250$ mm.

In addition to the beam and echelle size problem, there is also a camera
problem. From Eq. (15.5.8) we find $\phi F_2 = 0.75$ arc-sec when $w' = 30\ \mu$m for our
assumed quasi-Littrow echelle on an 8-m telescope. This projected slit width was
chosen to satisfy the Nyquist sampling criterion with 15 $\mu$m pixels, but that is no
longer possible for an 8-m telescope unless the slit width is reduced, because
$\phi = 1$ arc-sec would require a camera with $F_2 = 0.75$. If we choose
$w' = 60\ \mu$m, then $F_2 = 1.5$ is a reasonable choice for the camera if the
requirement $\phi = 1$ arc-sec is kept.
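The scaling to an 8-m telescope follows directly from Eqs. (15.5.7) and (15.5.8). The sketch below restates that arithmetic; the helper names are illustrative, and $r = 1$ is assumed for the quasi-Littrow case.

```python
def beam_diameter_mm(R, phi_arcsec, D_m, Gamma):
    """Collimated beam diameter d1 from Eq. (15.5.7)."""
    return R * phi_arcsec * D_m / (200.0 * Gamma)

def slit_camera_product(w_um, D_m, r=1.0):
    """Product phi * F2 (arc-sec) from Eq. (15.5.8); r = 1 assumed for quasi-Littrow."""
    return w_um / (5.0 * r * D_m)

d1_r2 = beam_diameter_mm(5e4, 1.0, 8.0, 4.87)       # same Gamma as the 4-m example
d1_r4 = beam_diameter_mm(5e4, 1.0, 8.0, 2.0 * 4.0)  # R-4 echelle, Gamma = 2 tan(delta) = 8
prod_30 = slit_camera_product(30.0, 8.0)            # w' = 30 um
prod_60 = slit_camera_product(60.0, 8.0)            # w' = 60 um
print(round(d1_r2), round(d1_r4), prod_30, prod_60)
```

The first beam diameter exceeds the 305-mm echelle width mentioned above, while the R-4 case fits within it.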

Although these examples for a 4- and 8-m telescope illustrate the approach in 
the design of particular echelle instruments, the same procedures apply to the 
design of any echelle or grating spectrometer. Once the basic outline of a design 
is found, the choice of an optical system that fits this outline can be made. 

Calculated values of $\mathcal{R}\phi$ for selected echelle spectrometer and telescope
combinations are found in Table 15.7.


Table 15.7

Resolution-Slit Width Products of Selected Echelle Spectrometers

Instrument    $d_1$ (mm)  $f_2$ (mm)  $\delta$ (°)  $\varepsilon$ (°)^a  $\mathcal{R}\phi$ (arc-sec)
HIRES         300         760         70.5          5                    45,000
Hectochelle   210         620         64.5          7.5                  46,000
HROS^b        160                     63.5                               29,000
CARCES        200         540         63.5          6                    60,000
HDS           270         770         70.5          6                    38,000
HRS           210         500         75            1                    35,000

^a $\varepsilon = \theta$ for in-plane designs; $\varepsilon = \gamma$ and $\theta = 0$ for off-plane designs.
^b $\mathcal{R}\phi$ product includes gain factor of 1.46 from immersed echelle.


15.5.f. CONCLUDING REMARKS


Although our discussion of echelle formats and spectrometer configurations is
given in terms of echelles with $\tan\delta = 2$ or larger, the relations used apply to any
blaze angle. An echellette grating can also be used with a cross disperser to
separate orders. Consider, for example, a grating with 300 grooves/mm and
$\tan\delta = 0.75$. From the grating equation in Table 15.4 we get $m\lambda_b\ (\mu\text{m}) = 4 \cos\theta$,
hence most of the visible spectrum is covered in four orders, $m = 6$ through
$m = 9$. These orders are easily separated with a crossed prism in the configuration
shown in Fig. 15.11. Because of the smaller $\sin\delta$ compared to an echelle, such a
system is appropriate for medium spectral resolution over a wide wavelength
range in a 2D format.
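The order coverage in this echellette example is easy to tabulate. The sketch below assumes the blaze relation $m\lambda_b = 2\sigma\sin\delta\cos\theta$ and sets $\theta = 0$ for simplicity; both are assumptions made for illustration.

```python
import math

grooves_per_mm = 300.0
tan_delta = 0.75
theta_deg = 0.0  # assumed; the text leaves theta unspecified

sigma_um = 1000.0 / grooves_per_mm                       # groove spacing in micrometers
sin_delta = tan_delta / math.sqrt(1.0 + tan_delta ** 2)  # sin(delta) from tan(delta)
m_lambda_um = 2.0 * sigma_um * sin_delta * math.cos(math.radians(theta_deg))

# Blaze wavelength (nm) in each of the four orders named in the text.
blaze_nm = {m: 1000.0 * m_lambda_um / m for m in range(6, 10)}
for m, lam in blaze_nm.items():
    print(m, round(lam))
```

Orders 6 through 9 have blaze wavelengths running from the red to the blue end of the visible range, consistent with the statement above.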

The range of possibilities for echelle spectrometer designs is impressive, as 
shown by the data in Tables 15.5-15.7. Instruments such as these, either fiber-fed 
or with direct slit illumination, are transforming the field of high-resolution 
astronomical spectroscopy. Our discussion is only an introduction to a large 
subject area, and the interested reader should study the many papers describing 
these instruments in detail. 


15.6. NONOBJECTIVE SLITLESS SPECTROMETERS 


An important technique for low-dispersion spectroscopy with large telescopes 
is the nonobjective mode. This mode is one in which a dispersing element, prism, 
blazed transmission grating, or a combination of the two, is placed in the 
converging beam near the telescope focal surface, as shown in Fig. 12.3. We 
discuss the characteristics of each of these in the nonobjective mode. The plate
factor $P$ and spectral purity $\delta\lambda$ for this mode are given by Eqs. (12.1.3b) and
(12.4.1), respectively, and are repeated here for convenient reference.

$$P = (sA)^{-1}, \tag{15.6.1}$$

$$\delta\lambda = \frac{f\phi}{sA}, \qquad \delta\lambda_c = \mathrm{TTC}\cdot P, \tag{15.6.2}$$

where $s$ is the distance from the element to the focus, $f$ is the telescope focal
length, $A$ is the angular dispersion, and $\delta\lambda_c$ is the spectral coma as defined in Eq.
(15.1.11).

The term “nonobjective” is used to distinguish this type of slitless instrument 
from the classical objective mode in which a prism or grating covers the aperture 
of a telescope. The disperser in the objective mode is in collimated light and the 
spectral resolution is determined by the seeing or telescope aberrations. The 
discussion in Sections 12.4 and 13.1 is sufficient for the objective mode.

The advantages of the nonobjective mode include ease of mounting a disperser
on any telescope with a minimum of effort and cost, as no auxiliary optics are 
needed. Hence slitless spectroscopy is not limited to telescopes of modest size. 
Another advantage is that within broad limits set by aberrations and disperser size 
any plate factor is possible. A disadvantage of this mode is that aberrations are 
present when the disperser is placed in a converging beam but, as we show, their 
effect can often be reduced to a negligible level compared to the seeing limit. It is 
also important to note that this mode is not an alternative to the objective mode, 
but is complementary. The objective mode is typically used to give P in the range
of 10-30 nm/mm, while the nonobjective mode is suitable for larger plate factors.

We now consider in turn the aberrations of the prism, grating, grism, and 
prism-grating in a converging beam. In each case we apply the paraxial 
approximation to all angles, an assumption that is justified for all practical 
configurations. Any significant deviations from results so derived are noted. 


15.6.a. NONOBJECTIVE PRISM


Although a prism or wedge is rarely used in this mode, its characteristics are
important when combined with a grating. Thus we determine prism aberrations in
anticipation of the discussion of a grism.

Consider a thin prism of index $N$ with apex angle $\gamma$, as shown in Fig. 15.15,
and angles of incidence $\theta_1$ and $\theta_2$ at the first and second surfaces, respectively. It
is convenient to express these angles in terms of the apex angle. If $\theta_1 = \varepsilon\gamma$ then,
from Snell's law, we find $\theta_2 = -\gamma(1 - \varepsilon/N)$. The parameter $\varepsilon$ determines the
prism orientation with respect to the chief ray; when $\varepsilon = 0$ the incident chief ray
is perpendicular to the first surface, and when $\varepsilon = N/2$ the prism is set for
minimum deviation.

The pertinent aberration coefficients for each surface are those of astigmatism
and coma, and from Table 5.1 we get

$$A_{11} = -\frac{\theta_1^2 (N^2 - 1)}{2N^2 s_1}, \qquad A_{12} = \frac{\theta_2^2 N (N^2 - 1)}{2 s_2}, \tag{15.6.3}$$

$$A_{21} = -\frac{\theta_1 (N^2 - 1)}{2N^2 s_1^2}, \qquad A_{22} = \frac{\theta_2 N (N^2 - 1)}{2 s_2^2}, \tag{15.6.4}$$

where $s_1$ and $s_2$ are the object distances at the first and second surfaces,
respectively. The astigmatism coefficients are reversed in sign from those in
Table 5.1 to reflect the change from sagittal to tangential image, as discussed in
Section 15.1.a.

Fig. 15.15. Cross section of prism of index $N$ with apex angle $\gamma$. $\theta_1$ and $\theta_2$ are angles of incidence
at the first and second faces, respectively.

Assuming the prism thickness is small compared to the distance to the focal
surface we have $s_2 = Ns_1$. This assumption, in turn, implies that the beam size is
the same at the two surfaces, hence each prism aberration coefficient is simply the
sum of surface coefficients. Substituting for $\theta_1$ and $\theta_2$, with $s_1 = s$, we find

$$A_{1w} = \frac{\gamma^2 (N^2 - 1)}{2s}\left(1 - \frac{2\varepsilon}{N}\right), \tag{15.6.5}$$

$$A_{2w} = -\frac{\gamma (N^2 - 1)}{2N s^2}, \tag{15.6.6}$$

where the subscript $w$ denotes a wedge or a prism. Note that the coma coefficient
is independent of $\varepsilon$, hence coma does not depend on the prism orientation. For the
astigmatism coefficient we see that it is zero when $\varepsilon = N/2$, the prism orientation
at minimum deviation.

An analysis including the wedge thickness shows that, to a good approximation,
the wedge coma coefficient is the sum of Eqs. (15.6.6) and (7.2.11), where
the latter is the coefficient for a plate of thickness $t$. The contribution of the
thickness term is smaller than that of Eq. (15.6.6) by a factor of $\varepsilon t/N^2 s$, and can
be ignored.

The transverse and spectral coma are given by

$$\mathrm{TTC} = 3A_{2w}\,y^2 s = -\frac{3\gamma (N^2 - 1)}{8NF^2}\,s, \tag{15.6.7}$$

$$\delta\lambda_c = -\frac{3(N^2 - 1)}{8NF^2}\,\frac{d\lambda}{dN}, \tag{15.6.8}$$

where $F$ is the focal ratio of the converging beam, $y = s/2F$ is the marginal ray
height at the prism, and the angular dispersion $A$ of
a thin prism is $\gamma\,dN/d\lambda$. Note that the spectral coma is independent of the prism
angle and distance from the focal surface.

As an example, consider a thin UBK7 prism in an f/8 beam. At a wavelength
of 400 nm, $N = 1.53$ and, with $dN/d\lambda$ from Fig. 13.1, we get $\delta\lambda_c = 40$ nm. Thus
a prism in the nonobjective mode is useful only for very low resolution, of order
10 in this example.
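A minimal numerical sketch of this prism example follows. Fig. 13.1 is not reproduced here, so the dispersion $|dN/d\lambda| = 1.3 \times 10^{-4}\ \text{nm}^{-1}$ used below is an assumed, representative value for a crown glass near 400 nm, not a number taken from the text.

```python
def prism_spectral_coma_nm(N, F, dN_dlam_per_nm):
    """Magnitude of the spectral coma of a thin prism, Eq. (15.6.8)."""
    return 3.0 * (N ** 2 - 1.0) / (8.0 * N * F ** 2) / abs(dN_dlam_per_nm)

N_glass = 1.53
F_ratio = 8.0
dN_dlam = -1.3e-4  # per nm; assumed value, see lead-in

coma_nm = prism_spectral_coma_nm(N_glass, F_ratio, dN_dlam)
resolution = 400.0 / coma_nm
print(f"spectral coma ~ {coma_nm:.0f} nm, resolving power ~ {resolution:.0f}")
```

With this assumed dispersion the spectral coma comes out near the 40 nm quoted above, and the resolving power is of order 10.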


Fig. 15.16. Tilt of focal surface FS, shift from nominal telescope focal plane FP, due to prism in
converging beam.


A final feature to note for a prism is the tilt of the focal surface. Figure 15.16
shows chief rays from sources in different parts of the field passing through
different prism thicknesses. Neglecting the prism angle, we use Eq. (2.4.5) to find
the shift in focus from the nominal telescope focal surface. Because the shift in
focus is proportional to $t$, the average local prism thickness, the surface on which
the spectrum is imaged is tilted with respect to the telescope focal surface. Using
Eq. (2.4.5) we find that the angle $\delta$ between these two surfaces, in terms of the
prism angle, is given by $\delta = \gamma(N - 1)/N$. This relation is relevant to our
discussions of the grism and prism-grating.


15.6.b. NONOBJECTIVE TRANSMISSION GRATING


We now consider a transmission grating in a converging beam at distance $s$
from the telescope focal surface. The thickness of the grating blank contributes to
the aberrations only if the grating face is not normal to the incident chief ray.
Compared to the grating aberrations, the contribution of the blank thickness is
roughly $t/3s$ smaller for practical tilt angles and is ignored.

The relations for this mode are those for the Monk-Gillieson mounting in
Tables 14.1-14.4, with the appropriate sign for a transmission grating. In the
paraxial approximation these are

$$A_{1g} = (\beta^2 - \alpha^2)/2s, \tag{15.6.9}$$

$$A_{2g} = (\beta - \alpha)/2s^2 = (1/2s^2)\,m\lambda/\sigma, \tag{15.6.10}$$

$$\kappa_t = -3/s, \tag{15.6.11}$$

where the grating equation is used to rewrite the coma coefficient in Eq.
(15.6.10). Considering first the coma, we find

$$\mathrm{TTC} = 3A_{2g}\,y^2 s = \frac{3s}{8F^2}\,\frac{m\lambda}{\sigma} = \frac{3\lambda}{8F^2 P}, \tag{15.6.12}$$

$$\delta\lambda_c = 3\lambda/8F^2, \qquad \mathcal{R} = 8F^2/3, \tag{15.6.13}$$

where $F$ is the focal ratio of the converging beam, and $\mathcal{R}$ is the spectral resolving
power. Note that the spectral coma and resolving power are independent of the
grating parameters and distance to the focal surface.

As an example, a grating in an f/8 beam has $\delta\lambda_c = 2.3$ nm at a wavelength of
400 nm, with $\mathcal{R} = 170$. Compared to the preceding prism example, the
improvement in spectral resolution is indeed substantial. Given the size of the spectral
coma, a grating in this mode can be used at significantly higher dispersion or
lower plate factor than a nonobjective prism.
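Eq. (15.6.13) is simple enough to evaluate directly. The sketch below reproduces the f/8 example; the function names are illustrative.

```python
def grating_spectral_coma_nm(lam_nm, F):
    """Spectral coma of a nonobjective transmission grating, Eq. (15.6.13)."""
    return 3.0 * lam_nm / (8.0 * F ** 2)

def coma_limited_resolving_power(F):
    """Resolving power set by spectral coma alone, Eq. (15.6.13)."""
    return 8.0 * F ** 2 / 3.0

coma_f8 = grating_spectral_coma_nm(400.0, 8.0)
R_f8 = coma_limited_resolving_power(8.0)
print(f"{coma_f8:.1f} nm, R = {R_f8:.0f}")
```

Because the resolving power scales as $F^2$, halving the focal ratio to f/4 lowers the coma-limited resolving power by a factor of four.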

The spectral resolution achievable with a nonobjective grating is set by
spectral coma if the dispersion is large and by seeing if the dispersion is small.
The boundary between these is found by setting $\delta\lambda_c = \delta\lambda$, where the latter is
given in Eq. (15.6.2).

Solving this relation for $P$ gives, in the units specified,

$$P\ (\text{nm/mm}) = \frac{75\,\lambda(\text{nm})}{F^3\,D(\text{m})\,\phi\,(\text{arc-sec})}, \tag{15.6.14}$$

where $D$ is the telescope diameter. The seeing blur is larger than the coma blur for
any $P$ larger than that given by Eq. (15.6.14). Results from this relation are
plotted in Fig. 15.17 for a selected set of focal ratios.

The constraint put on the plate factor by this relation is a conservative one 
because the entire width of the comatic image is used in the spectral coma. 
Because about 80% of the light in a comatic image is within a width TTC/2, a 
plate factor limit that is one-half that given in Eq. (15.6.14) and Fig. 15.17 is 
somewhat more realistic. 
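The boundary in Eq. (15.6.14) can be checked by comparing the two blurs directly. The sketch below uses an illustrative 4-m, f/8 case (these telescope parameters are an assumption chosen for the example) and the exact arc-second conversion, so the two blurs agree only to within the rounding of the coefficient 75.

```python
ARCSEC_PER_RAD = 206264.8

def plate_factor_limit(lam_nm, F, D_m, phi_arcsec):
    """Plate factor (nm/mm) at which spectral coma equals seeing blur, Eq. (15.6.14)."""
    return 75.0 * lam_nm / (F ** 3 * D_m * phi_arcsec)

lam_nm, F, D_m, phi_arcsec = 400.0, 8.0, 4.0, 1.0   # illustrative values
P_limit = plate_factor_limit(lam_nm, F, D_m, phi_arcsec)

# Seeing blur from Eq. (15.6.2): delta_lambda = f * phi * P, with f = F * D.
f_mm = F * D_m * 1000.0
seeing_blur_nm = f_mm * (phi_arcsec / ARCSEC_PER_RAD) * P_limit
coma_blur_nm = 3.0 * lam_nm / (8.0 * F ** 2)        # Eq. (15.6.13)

P_realistic = 0.5 * P_limit   # the less conservative limit suggested in the text
print(f"P = {P_limit:.1f} nm/mm, seeing {seeing_blur_nm:.2f} nm, coma {coma_blur_nm:.2f} nm")
```

The steep $F^{-3}$ dependence means the limiting plate factor rises quickly for faster beams.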

We now determine the characteristics of the astigmatic image. It is evident
from Eq. (15.6.9) that the astigmatism can be made zero at a wavelength $\lambda_0$ by
setting $\beta = -\alpha$, hence $2\alpha = -m\lambda_0/\sigma$. If the grating is tilted by angle $\alpha$, however,
the detector is also at angle $\alpha$ with respect to the grating surface. As a result of this
tilt the plate factor now varies across the detector, and the fractional change in $P$
across a detector of width $W$ is $|\alpha|W/s$.

The transverse astigmatism is

$$\mathrm{TAS} = 2A_{1g}\,y s = \frac{(\beta^2 - \alpha^2)\,s}{2F} = \frac{\lambda(\lambda - \lambda_0)}{2FP^2 s}, \tag{15.6.15}$$

where the image length is $2 \cdot \mathrm{TAS}$. For a given $P$ and $F$ we see that TAS varies
inversely as $s$. To keep $P$ constant for a larger $s$ means choosing a grating with
fewer grooves per millimeter. Alternatively, for a given grating at different
distances from the focal surface, the size of TAS is directly proportional to $s$.

Fig. 15.17. Plate factor $P$ at which spectral coma equals seeing blur of 1 arc-sec for nonobjective
transmission grating. See Eq. (15.6.14).

As an example, take a first-order grating with 150 grooves/mm in an f/8
beam. This grating has a plate factor of 100 nm/mm when $s$ is 67 mm. At
wavelengths of 400 and 600 nm, respectively, the astigmatic image lengths are 30
and 68 $\mu$m when $\alpha = 0$. If $\lambda_0$ is 500 nm, hence $\alpha = -2.15^\circ$, the image lengths
are 7 and 11 $\mu$m at 400 and 600 nm, respectively. Choosing a grating with
75 grooves/mm and letting $s = 133$ mm reduces the image lengths by a factor of
two. Although astigmatism does not affect spectral resolution, it is evident that it
is desirable to keep the image lengths short to maintain spectrographic speed.
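The image lengths in this example follow from Eq. (15.6.15). In the sketch below the plate factor is computed from the paraxial relation $P = \sigma/(ms)$, which is an assumption consistent with Eq. (15.6.1) for small angles, and $\alpha = 0$ is represented by $\lambda_0 = 0$.

```python
import math

def image_length_um(lam_nm, lam0_nm, grooves_per_mm, s_mm, F, m=1):
    """Astigmatic image length 2*|TAS| in micrometers from Eq. (15.6.15)."""
    sigma_nm = 1.0e6 / grooves_per_mm
    P = sigma_nm / (m * s_mm)                     # paraxial plate factor, nm/mm
    tas_mm = lam_nm * (lam_nm - lam0_nm) / (2.0 * F * P ** 2 * s_mm)
    return 2.0 * abs(tas_mm) * 1000.0

def grating_tilt_deg(lam0_nm, grooves_per_mm, m=1):
    """Tilt alpha giving zero astigmatism at lam0, from 2*alpha = -m*lam0/sigma."""
    sigma_nm = 1.0e6 / grooves_per_mm
    return math.degrees(-m * lam0_nm / (2.0 * sigma_nm))

untilted = [image_length_um(lam, 0.0, 150.0, 67.0, 8.0) for lam in (400.0, 600.0)]
tilted = [image_length_um(lam, 500.0, 150.0, 67.0, 8.0) for lam in (400.0, 600.0)]
alpha_deg = grating_tilt_deg(500.0, 150.0)
coarse = image_length_um(400.0, 0.0, 75.0, 133.0, 8.0)   # 75 grooves/mm at s = 133 mm
print([round(x) for x in untilted], [round(x) for x in tilted], round(alpha_deg, 2))
```

The last value illustrates the factor-of-two reduction obtained with the coarser grating at twice the distance.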

From Eq. (15.6.11) we see that the spectrum of each source has its own curved 
surface with radius s/3, with the surface concave as seen from the grating. 
Because all of the spectra are recorded on a flat detector, this curvature results in a 
defocus blur and can degrade the spectral resolution. 

The image surfaces for the examples with $\alpha = 0$ and $\alpha = -2.15^\circ$ are shown in
Fig. 15.18, with a line for each representing the optimum location of a detector
for the 400-600 nm range. It is evident from Fig. 15.18 that the detector "fits" the
image surface somewhat better for the tilted grating, with the zero-order image
also in better focus. Simple geometry can be used in each case to determine the
defocus blur at the ends of the spectral range.

Fig. 15.18. Tilted image surfaces for nonobjective transmission mode with $P = 100$ nm/mm.
Horizontal scale is stretched by 2.5 times relative to vertical scale. Zero order and spectrum are in
focus on surface with radius of curvature 22 mm.

A final important feature of the nonobjective grating mode is the presence of a
zero-order reference for each spectrum, a reference not present for a prism in any
slitless mode. This is a significant advantage in that quantitative measures of line
positions are now possible. This is important, for example, for approximate
measures of redshifts in emission-line objects. For a typical blazed grating with
$P = 100$ nm/mm, it turns out that the brightness in the zero order is comparable
to that in the dispersed spectrum. Examples of such spectra are shown in the
reference by Hoag and Schroeder (1970).


15.6.c. NONOBJECTIVE GRISM 


The principal defects in spectra taken with a nonobjective grating are coma 
across the spectral range and defocus at the ends of the range. These defects are 
significantly reduced with a grism, a prism with a grating replicated on one of its 
faces. The characteristics of a grism in convergent light were first described by 
Bowen and Vaughan (1973); in our discussion we reproduce the main results 
from their treatment. 

A cross-section of a grism is shown in Fig. 15.19, with the same notation for
the angles as in Fig. 15.15. We again assume the paraxial approximation and
neglect the prism thickness. In this approximation, the aberration coefficients of
the grism are the sum of the corresponding coefficients of the prism and grating.
Taking results from the previous sections, we get

$$A_{1gr} = \frac{\gamma^2 (N^2 - 1)}{2s}\left(1 - \frac{2\varepsilon}{N}\right) + \frac{\beta^2 - \alpha^2}{2s}, \tag{15.6.16}$$

$$A_{2gr} = \frac{1}{2s^2}\left[\frac{m\lambda}{\sigma} - \frac{\gamma (N^2 - 1)}{N}\right]. \tag{15.6.17}$$

Fig. 15.19. Cross section of grism with apex angle $\gamma$. $\theta_1$ is the angle of incidence at the first face;
the grating is on the second face.

The coma coefficient is zero for $\lambda = \lambda_0$ when the prism angle $\gamma$ is given by

$$\gamma = \frac{m\lambda_0}{\sigma}\,\frac{N}{N^2 - 1}, \tag{15.6.18}$$

and therefore

$$\mathrm{TTC} = \frac{3s}{8F^2}\,\frac{m}{\sigma}\,|\Delta\lambda| = \frac{3|\Delta\lambda|}{8F^2 P}, \tag{15.6.19}$$

where $\Delta\lambda = \lambda - \lambda_0$. Comparing the spectral coma for a grism from Eq. (15.6.19)
with that for a grating given in Eq. (15.6.13) we see that the coma for the grism is
several times smaller. This implies, in turn, that a grism can be used at a plate
factor that is smaller by the same amount, or at a faster focal ratio, before the
coma and seeing blurs are equal.

The direction of the diffracted chief ray at the zero-coma wavelength is shown
in Fig. 15.19. In this direction the dispersions of the prism and grating add; the
grism dispersion is typically a few percent larger than that of the grating alone.

The astigmatism coefficient, unlike the coma, depends on $\varepsilon$ and the grism
orientation. With the grating equation, $m\lambda = \sigma(\beta - \alpha)$, and $\alpha = \gamma(\varepsilon - N)$, we can
rewrite Eq. (15.6.16) as

$$A_{1gr} = \frac{\gamma^2 (N^2 - 1)}{2N^2 s}\left[N^2 (1 - \zeta)^2 - 2\varepsilon N (1 - \zeta) - \zeta^2\right], \tag{15.6.20}$$

where $\zeta = \lambda/\lambda_0$. Note that $A_{1gr}$ is independent of $\varepsilon$ when $\zeta = 1$. Setting
$dA_{1gr}/d\zeta = 0$ and evaluating at $\zeta = 1$ gives $\varepsilon = 1/N$. With this choice of $\varepsilon$ we
have astigmatism constant near $\lambda_0$ and, to a good approximation, constant over a
significant range centered on $\lambda_0$. Note that $\beta = 0$ at the corrected wavelength for
this grism orientation. We show shortly that this choice of $\varepsilon$ also significantly
reduces defocus due to image surface curvature.

With $\varepsilon = 1/N$, substitution of $A_{1gr}$ into Eq. (14.2.4) gives

$$\mathrm{TAS} = -\frac{d\,\gamma^2 (N^2 - 1)}{2N^2}\left[1 - (N^2 - 1)(1 - \zeta)^2\right], \tag{15.6.21}$$

where $d$ is the beam diameter at the grating surface. Note that TAS is largest at $\lambda_0$
and decreases slowly as $\lambda$ changes.

We now illustrate these results with an example using a grating with
75 grooves/mm and a fused silica prism with $\gamma = 2.76^\circ$, hence $\lambda_0 = 500$ nm.
Assume an f/8 beam and $s = 240$ mm, hence $P \approx 55$ nm/mm. With these
parameters we get TTC = 11 $\mu$m for $\Delta\lambda = \pm 100$ nm, and image length of 38 $\mu$m
at the corrected wavelength. Results from ray traces of a 10-mm-thick grism show
that the zero-coma wavelength is 510 nm, with a plate factor of 53.5 nm/mm in
this vicinity. Aberrations from ray traces are in excellent agreement with those
given by the relations for TTC and TAS.

If this grism is placed in an f/4 beam at $s = 120$ mm, the plate factor
$P \approx 110$ nm/mm. With these changes the image length is unchanged and TTC
is two times larger. Because the beam is faster, aberrations due to the thickness of
the grism are larger and a ray-trace analysis is necessary to determine whether
they are significant.

We noted in our discussion of the nonobjective prism that the nominal surface on
which the spectra are in focus is tilted by an angle $\delta = \gamma(N - 1)/N$ to the telescope
focal plane. With $\varepsilon = 1/N$, the second surface of the grism is tilted by the same
angle. Because the grating is on this surface, the perpendicular distance between the
grating and detector is constant and the plate factor is the same over the field.

The fit between a tilted detector and the curved image surface of a single
spectrum is shown in Fig. 15.20 for the grism example already given here, with
tilt $\delta = 0.87^\circ$. Comparing this diagram with Fig. 15.18, it is evident that the fit is
better for the grism mode. A better fit is obtained for the grism example if $\theta_1$ is
0.5° smaller, but at the expense of variable $P$ in the dispersion direction. Thus the
grism corrects the two major defects of the nonobjective grating, but with the
slight added complication of a tilted detector.

Fig. 15.20. Tilted, curved image surface for grism mode with $P = 55$ nm/mm. Grism parameters
are $\alpha = 1.89^\circ$, $\gamma = 2.76^\circ$, $\delta = 0.87^\circ$. Horizontal scale is stretched by 20 times relative to vertical scale.

In our discussion we have ignored the grism thickness and variation of grism
index with wavelength. Their effects change some of the preceding results,
though not significantly, and were included by Bowen and Vaughan in their
analysis.
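The grism example given earlier in this section can be checked with a few lines of arithmetic based on Eqs. (15.6.18), (15.6.19), and (15.6.21). The index of fused silica near 500 nm is not stated in the text; the value $N = 1.4623$ below is an assumed, commonly tabulated figure, and the paraxial plate factor $P = \sigma/(ms)$ is likewise an assumption.

```python
import math

N = 1.4623            # assumed index of fused silica near 500 nm
grooves_per_mm = 75.0
lam0_nm = 500.0
F, s_mm, m = 8.0, 240.0, 1

sigma_nm = 1.0e6 / grooves_per_mm
gamma = (m * lam0_nm / sigma_nm) * N / (N ** 2 - 1.0)   # Eq. (15.6.18), radians
gamma_deg = math.degrees(gamma)

P = sigma_nm / (m * s_mm)                               # paraxial plate factor, nm/mm
ttc_um = 1000.0 * 3.0 * 100.0 / (8.0 * F ** 2 * P)      # Eq. (15.6.19), |dlambda| = 100 nm

d_mm = s_mm / F                                         # beam diameter at the grating
tas_mm = d_mm * gamma ** 2 * (N ** 2 - 1.0) / (2.0 * N ** 2)  # |TAS| at lam0, Eq. (15.6.21)
image_length_um = 2.0 * tas_mm * 1000.0

delta_deg = gamma_deg * (N - 1.0) / N                   # focal surface tilt
print(f"gamma={gamma_deg:.2f} deg, P={P:.1f} nm/mm, TTC={ttc_um:.1f} um, "
      f"length={image_length_um:.0f} um, delta={delta_deg:.2f} deg")
```

The computed values agree with those quoted in the example to within rounding.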


15.6.d. NONOBJECTIVE PRISM-GRATING


The final nonobjective device considered consists of a separate prism and
grating, as shown in Fig. 15.21. In this system there are two additional degrees of
freedom: the separation between the elements and their relative orientations. With
this system it is possible to eliminate both coma and astigmatism at one
wavelength in the desired spectral range. This type of device is then suitable
for fast beams and has been used, for example, at the f/2.7 prime foci of the 4-m
telescopes at the Kitt Peak (Tucson, Arizona) and Cerro Tololo Observatories
(Chile).

Fig. 15.21. Nonobjective prism-grating. FS, focal surface.

Because of the several degrees of freedom for a prism-grating, we make no 
attempt at a thorough analysis but only give selected results without derivation. 
The aberration coefficients for the prism-grating are easily found by substituting
the prism and grating coefficients into Eq. (5.6.7), with the results

$$A_{1pg} = \frac{1}{2s_2}\left[\gamma^2 (N^2 - 1)\left(1 - \frac{2\varepsilon}{N}\right)\frac{s_1}{s_2} + \beta^2 - \alpha^2\right], \tag{15.6.22}$$

$$A_{2pg} = \frac{1}{2s_2^2}\left[\frac{m\lambda}{\sigma} - \frac{\gamma (N^2 - 1)}{N}\,\frac{s_1}{s_2}\right], \tag{15.6.23}$$

where $s_1$ and $s_2$ are the distances from the prism and grating, respectively, to the
focal surface. Note that these relations reduce to those for the grism when $s_1 = s_2$.
Setting $A_{2pg}$ to zero gives

$$\gamma = \frac{s_2}{s_1}\,\frac{m\lambda_0}{\sigma}\,\frac{N}{N^2 - 1}. \tag{15.6.24}$$

Because the prism is farther from the focus than the grating in the arrangement in
Fig. 15.21, coma correction is achieved with a smaller apex angle compared with
the grism.

The way in which astigmatism varies is most easily seen by taking the prism
and grating tilt angles of the grism, hence $\varepsilon = 1/N$ and $\delta = \gamma(N - 1)/N$,
respectively. With these values as starting points, and assuming $s_2$ and $\lambda_0$ are
fixed, Eq. (15.6.22) can be evaluated at $\lambda_0$ for different $s_1$. Assigning the grism
parameters to the grating and prism, astigmatism is zero when $s_1 = 1.88s_2$. If $\varepsilon$ is
made smaller, then astigmatism is zero at a smaller value of $s_1$. Ray traces show
good agreement with the calculated values for an f/8 beam. In faster beams, say,
f/3, the relations above are a good first approximation, but ray traces are
necessary to optimize the system and find the best grating orientation to fit the
curved image surface to the tilted detector.
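The zero-astigmatism separation quoted above can be recovered numerically from Eqs. (15.6.22) and (15.6.24). The sketch below rests on several assumptions that are not spelled out in the text: the apex angle is rescaled with $s_1/s_2$ so that coma stays corrected, the grating keeps the grism tilt $\delta = \gamma(N-1)/N$, the thin prism deviates the chief ray by $(N-1)\gamma$ so that $\alpha = -\gamma(N^2-1)/N$, and $N = 1.4623$ is taken for fused silica.

```python
def astig_bracket(q, N, eps, m_lam0_over_sigma):
    """Bracket of Eq. (15.6.22) at lambda_0 for a separation ratio q = s1/s2."""
    gamma = m_lam0_over_sigma * N / (N ** 2 - 1.0) / q   # Eq. (15.6.24)
    alpha = -gamma * (N ** 2 - 1.0) / N                  # assumed incidence angle at grating
    beta = alpha + m_lam0_over_sigma                     # grating equation at lambda_0
    prism_term = gamma ** 2 * (N ** 2 - 1.0) * (1.0 - 2.0 * eps / N) * q
    return prism_term + beta ** 2 - alpha ** 2

def zero_astig_ratio(N, eps, m_lam0_over_sigma, lo=1.0, hi=3.0):
    """Bisection for the ratio s1/s2 at which the astigmatism coefficient vanishes."""
    f_lo = astig_bracket(lo, N, eps, m_lam0_over_sigma)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        f_mid = astig_bracket(mid, N, eps, m_lam0_over_sigma)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

N = 1.4623                       # assumed index of fused silica near 500 nm
m_lam0_over_sigma = 0.5 / (1000.0 / 75.0)   # 500 nm, 75 grooves/mm, first order
q_grism = zero_astig_ratio(N, 1.0 / N, m_lam0_over_sigma)
q_smaller_eps = zero_astig_ratio(N, 0.9 / N, m_lam0_over_sigma)
print(round(q_grism, 2), round(q_smaller_eps, 2))
```

Under these assumptions the first ratio matches the value quoted in the text, and reducing $\varepsilon$ moves the zero to a smaller $s_1$, as stated.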

The slitless modes already discussed here assume a dispersing element in the 
converging beam ahead of the telescope focal surface. It is also possible to place a 
disperser in the diverging beam behind the focal surface and use a separate 
camera to focus the spectra. Because of the added optical elements this 
arrangement is less efficient than those previously discussed here, but has the 
advantage that the focal ratio of the final beam can be chosen as different from 
that of the telescope. A system of this type with a transmission grating and 
Schmidt camera has been built at the Royal Greenwich Observatory in Great 
Britain. 


15.7. CONCLUDING REMARKS


With the exception of the nonobjective mountings, the discussion in this 
chapter has made little mention of the appropriate telescope focus for each 
instrument type. Instruments of smaller size are usually used at the Cassegrain 
focus, while large-beam spectrometers are usually placed at a Nasmyth focus on a 
platform that rotates with the telescope. Fiber-fed spectrometers are generally 
placed on a fixed platform near the telescope. 

Older telescope facilities often included a Coudé room below the telescope, 
with a three-mirror system redirecting the light from a Cassegrain telescope along 
the polar axis. Large low-order diffraction gratings were the principal dispersing 
elements, with Coudé beam diameters as large as 300mm. The dispersed light 
most often was sent to one of several Schmidt cameras. Examples of telescopes 
with such facilities include the Hale 5-m and Shane 3-m (see Bowen (1960)).

Spectrometer designs have become increasingly sophisticated in order to take 
advantage of the gains possible in observing efficiency with large telescopes at 
sites with superb seeing. This is especially true at the camera end with large, fast 
refractive and catadioptric cameras, such as those designed by Harland Epps for 
the Keck and other large telescopes. The changes at the input ends of spectro- 
meters have been almost as dramatic, especially in instruments with hundreds of 
fibers feeding a single entrance slit. These spectrometers will certainly
revolutionize the gathering of spectral information from celestial sources.


Chapter 16 Adaptive Optics: An Introduction


The magnitude limits that can be reached by a ground-based telescope depend 
on many factors, of which the image size of a stellar source due to atmospheric 
effects is an important one. A powerful tool for reducing or eliminating seeing 
effects, and thereby reducing the image size, is adaptive optics. The techniques of 
adaptive optics are those by which telescope optics are adjusted on a rapid time 
scale to compensate for distortions in the wavefront entering a telescope. These 
adjustments are generally applied to a relatively small optical element in an 
optical train following the Cassegrain or Nasmyth focus and continuously 
readjusted on a time scale measured in milliseconds. For a good overview of 
the principles of adaptive optics, see the paper by Beckers and Goad (1987) and a 
review article by Beckers (1993). 

Techniques of adaptive optics are to be distinguished from those of active 
optics for which adjustments are relatively much slower, often on a time scale of 
hours. Adjustments of this latter type are usually made to the primary mirror by 
actuators that adjust the shape of the mirror. The goal of active optics corrections 
is usually to reduce the aberrations of a telescope. 

In this chapter we discuss the basic principles of adaptive optics and consider 
techniques by which a distorted wavefront is corrected before detection. It is also 
possible to apply postdetection corrections to images and achieve high angular 
resolution on bright sources with ground-based telescopes. For information on 
such techniques, and the fields of speckle imaging and speckle interferometry, the 
reader should consult the reference by McAlister (1985).

We describe some of the effects of the atmosphere on images, including
selected relations based on the theory of atmospheric turbulence. The relations 
given by this theory are not derived; for thorough discussions of the theory of 
turbulence applied to optical astronomy the reader should consult the references 
by Roddier (1981) and Coulman (1985). We consider the effects of turbulence 
from the point of view of the time-averaged modulation transfer function (MTF) 
of the atmosphere. 


16.1. EFFECTS OF ATMOSPHERIC TURBULENCE 


The most notable effect of a turbulent atmosphere is a blurred image in the 
focal plane of a telescope. For a large telescope the image size, often called the 
seeing disk, is usually larger than the diffraction disk. The angular radius of the 
Airy disk and the limit of resolution $\alpha_0$, from Eq. (10.2.9), can be written as

$$\alpha_0\ (\text{arc-sec}) = 0.25\,\lambda(\mu\text{m})/D(\text{m}). \tag{16.1.1}$$

If the radius $\alpha$ of the seeing disk is 0.5 arc-sec and $\lambda = 0.5\ \mu$m, we get $\alpha = \alpha_0$ for a
25-cm telescope. Thus for visible wavelengths and 1-m or larger telescopes, the
image size is determined by atmospheric effects. Because $\alpha_0 \propto \lambda/D$ the seeing
disk may be comparable to the diffraction disk at infrared wavelengths for large
telescopes, especially at the longest wavelengths that reach the ground.
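A short sketch of Eq. (16.1.1) makes the comparison between the seeing and diffraction disks concrete. The coefficient is checked against the standard Airy radius $1.22\lambda/D$; the 8-m, 10-$\mu$m case is an illustrative choice, not an example from the text.

```python
ARCSEC_PER_RAD = 206264.8

def airy_radius_arcsec(lam_um, D_m):
    """Angular radius of the Airy disk from Eq. (16.1.1)."""
    return 0.25 * lam_um / D_m

def crossover_diameter_m(lam_um, seeing_radius_arcsec):
    """Aperture at which the Airy radius equals a given seeing-disk radius."""
    return 0.25 * lam_um / seeing_radius_arcsec

coefficient = 1.22 * 1.0e-6 * ARCSEC_PER_RAD           # 1.22 lambda/D for 1 um and 1 m
D_cross = crossover_diameter_m(0.5, 0.5)               # the 25-cm case in the text
airy_infrared = airy_radius_arcsec(10.0, 8.0)          # illustrative: 10 um on an 8-m telescope
print(round(coefficient, 3), D_cross, round(airy_infrared, 2))
```

At 10 $\mu$m on an 8-m aperture the Airy radius is a few tenths of an arc-second, comparable to a good seeing disk, which illustrates the closing remark of the paragraph above.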


16.1.a. SEEING AND SCINTILLATION 


The effect of atmospheric turbulence on stellar images is usually separated into 
two distinct phenomena. Seeing is the term used to describe random changes in 
the direction of light entering a telescope, while scintillation refers to random 
fluctuations in the intensity. Both of these effects arise from variations in the 
index of refraction and give rise to a distorted wavefront. A cross section of such 
a wavefront reaching the ground at a given instant of time is shown schematically 
in Fig. 16.1. 

We first describe the effects of seeing and scintillation on a stellar image as 
observed with the eye. Scintillation is most evident to the unaided eye as the 
phenomenon called “twinkling.” In a telescope the twinkling is usually not seen, 
and a photometer is needed to record the fluctuations in intensity. In general, the 
larger the aperture the smaller are the fractional changes in the intensity. 

The effect of seeing is a function of the telescope aperture. In good seeing, 
with a 10-cm aperture or less, the Airy disk of a star moves randomly about its 
mean position in the focal plane with excursions of 1 or 2 arc-seconds. In a large 
telescope, 1 m or larger, a blurred image is seen with little or no motion of the 
Fig. 16.1. Cross sections of undistorted wavefront $\Sigma_0$ at top of atmosphere and distorted 
wavefront $\Sigma$ at ground, after passage through turbulent atmosphere. 


image as a whole. If the eye could follow the rapid changes within the image, it 
would see a changing pattern of speckles, each speckle having a size comparable 
to an Airy disk. A given speckle pattern is stationary over times on the order of 
10-50 msec, with two patterns similar only for point sources within about 
10 arc-sec of one another. 

From these observations we deduce that the curvature of the wavefront is 
negligible over distances of the order of 10 cm, with instantaneous slopes of 1 or 
2 arc-sec from an undistorted wavefront. The image seen in a large telescope is 
thus the average over many sections of the wavefront, each with a different 
instantaneous slope. 


16.1.b. MODULATION TRANSFER FUNCTIONS 


The demonstration that wavefront distortions arise from variations in index of 
refraction was discussed in Chapter 3 from the point of view of Fermat’s 
Principle. This approach was adequate for showing the origin of seeing, but a 
more fruitful approach is one based on a theory of atmospheric turbulence. We 
now present selected results derived from a statistical approach, with results taken 
from Roddier (1981). 

We first consider the image of a point source that has been broadened by 
seeing to a width large compared to the diffraction width. An approximate form 
for the distribution of energy within such an image is a Gaussian, with the 
normalized intensity given by 


$$i(\alpha) = \exp(-\alpha^2/2\sigma'^2), \tag{16.1.2}$$


where $\alpha$ is the angular distance from the image peak and $\sigma'$ is the rms deviation in 
a given direction from the peak. To find the MTF we substitute Eq. (16.1.2) into 
Eq. (11.1.8), adjust the normalization factor to give $T(0) = 1$, and get 


$$T_a(\nu) = \exp(-2\pi^2\sigma'^2\nu^2). \tag{16.1.3}$$


If the unit of $\sigma'$ is arc-seconds, the unit of $\nu$ is cycles per arc-sec. Note the 
correspondence between these relations for $i(\alpha)$ and $T_a(\nu)$ with Eqs. (11.1.17) and 
(11.1.18). Equation (11.1.18) gives the pointing degradation function, and its 
product with the telescope MTF is the system MTF in the presence of pointing 
error. Equation (16.1.3) can be taken as the system MTF provided the telescope 
MTF is essentially unity over the range where $T_a(\nu)$ is effectively nonzero. 

As an illustration, we choose $\sigma'$ to give an image whose FWHM is 0.5 arc-sec. 
Setting $i(\alpha) = 0.5$ with $\alpha = 0.25$ arc-sec in Eq. (16.1.2) gives $\sigma' = 0.212$ arc-sec. 
Substituting this value of $\sigma'$ in Eq. (16.1.3), we find $T = 0.029$ for 
$\nu = 2$ cycles/arc-sec. This is the effective cutoff frequency and is small compared to 
the diffraction cutoff frequency $D/\lambda$ for a large telescope. For an 8-m telescope at 
$\lambda = 500$ nm we get $\nu_c = D/\lambda = 77.6$ cycles/arc-sec, and in this case we are 
justified in taking Eq. (16.1.3) as the system MTF. 
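
The numbers in this illustration can be reproduced with a short sketch. The helper names below are illustrative; the relations are Eqs. (16.1.2) and (16.1.3) together with the conversion from cycles per radian to cycles per arc-sec.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # about 206265


def sigma_from_fwhm(fwhm_arcsec: float) -> float:
    """rms width of the Gaussian in Eq. (16.1.2) for a given FWHM."""
    return fwhm_arcsec / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_mtf(nu_cycles_per_arcsec: float, sigma_arcsec: float) -> float:
    """Eq. (16.1.3): MTF of a Gaussian seeing profile."""
    return math.exp(-2.0 * math.pi**2 * sigma_arcsec**2 * nu_cycles_per_arcsec**2)


def diffraction_cutoff(diameter_m: float, wavelength_m: float) -> float:
    """Diffraction cutoff D/lambda expressed in cycles per arc-sec."""
    return diameter_m / wavelength_m / ARCSEC_PER_RAD


if __name__ == "__main__":
    sigma = sigma_from_fwhm(0.5)
    print(round(sigma, 3))                      # rms width in arc-sec
    print(round(gaussian_mtf(2.0, sigma), 3))   # MTF at 2 cycles/arc-sec
    print(round(diffraction_cutoff(8.0, 500e-9), 1))
```

With the unrounded rms width the MTF at 2 cycles/arc-sec comes out slightly below 0.029; the difference from the quoted value is only rounding.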

Following the procedure in Section 11.1 we rewrite Eq. (16.1.3) in terms of the 
normalized spatial frequency. The resulting atmospheric degradation function is 


$$T_a(\nu_n) = \exp(-2\pi^2\sigma^2\nu_n^2), \tag{16.1.4}$$


where $\sigma = \sigma'(D/\lambda)$, $\nu_n = \nu(\lambda/D)$, and the subscript a denotes the atmosphere. 
Figure 16.2 shows $T_a$ from Eq. (16.1.4) for $D = 8$ m and $\sigma' = 0.212$ arc-sec, at 
$\lambda = 0.5$ and 2 $\mu$m, superposed on the MTF for a perfect telescope with no central 
obscuration. 

Although Eq. (16.1.3) is a reasonable approximation to the MTF of a large 
ground-based telescope, it is an ad hoc relation based solely on a statistical 
approach. This relation does not, for example, indicate how the rms seeing value 
might depend on wavelength or zenith angle. 

An approach based on the physics of so-called Kolmogorov turbulence 
gives an MTF that leads to a better description of the observed image profile. A 
detailed discussion of this approach is given by Roddier (1981). The resulting 
MTF and degradation function, both as given by Roddier, are 


$$T_R(\nu) = \exp\left[-3.44\left(\frac{\lambda\nu}{r_0}\right)^{5/3}\right], \tag{16.1.5}$$


$$T_R(\nu_n) = \exp\left[-3.44\left(\frac{D}{r_0}\,\nu_n\right)^{5/3}\right], \tag{16.1.6}$$
Fig. 16.2. Degradation function $T_a(\nu_n)$ for atmospheric turbulence according to Eq. (16.1.4) 
superposed on MTF for perfect telescope. Perfect MTF (solid line); $\lambda = 0.5\ \mu$m (dashed line); 
$\lambda = 2\ \mu$m (dotted line). See text following Eq. (16.1.4) for values of the other parameters. 


where $\nu$ is the angular frequency and $r_0$ is a wavelength-dependent length that is a 
measure of the seeing quality. As noted by Roddier, these $T(\nu)$ are appropriate for 
a long-exposure image. 

The parameter $r_0$ is defined such that the angular resolution of a telescope 
is set by the atmosphere when $D > r_0$, and set by the telescope when $D < r_0$. 
For a large telescope limited by seeing, the limiting angular resolution for an 
unobstructed circular aperture is approximately 


$$\alpha_0 \simeq 1.22(\lambda/r_0). \tag{16.1.7}$$


The specific form of $r_0$ is given by 


$$r_0 = 0.185\,\lambda^{6/5}(\cos\gamma)^{3/5}\,\Sigma^{-3/5}, \tag{16.1.8}$$


where $\gamma$ is the zenith angle and $\Sigma$ is a function integrated through the atmosphere 
that is a measure of the turbulence. The reader should consult the reference by 
Roddier cited here for details on the function $\Sigma$. 

It is evident from Eq. (16.1.8) that $r_0$ increases with increasing wavelength and 
decreases with increasing zenith angle. At $\gamma = 45°$, $r_0$ is approximately 20% 
smaller than at the zenith. In the examples to follow, we assume $\gamma$ is zero. 
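
The wavelength and zenith-angle dependence in Eq. (16.1.8) can be sketched as a simple scaling law, assuming the turbulence integral is held fixed. The function name and the choice of a reference value (26 cm at 0.5 um, the value used in the examples of this section) are illustrative assumptions.

```python
import math


def scale_r0(r0_ref_cm: float, lam_ref_um: float, lam_um: float,
             zenith_deg: float = 0.0) -> float:
    """Scale r0 from a zenith reference value using Eq. (16.1.8).

    Assumes constant turbulence, so that r0 varies as
    lambda**(6/5) * cos(zenith)**(3/5).
    """
    wavelength_factor = (lam_um / lam_ref_um) ** (6.0 / 5.0)
    zenith_factor = math.cos(math.radians(zenith_deg)) ** (3.0 / 5.0)
    return r0_ref_cm * wavelength_factor * zenith_factor


if __name__ == "__main__":
    # Fractional loss in r0 at a zenith angle of 45 degrees.
    print(1.0 - scale_r0(26.0, 0.5, 0.5, zenith_deg=45.0) / 26.0)
    # r0 at 2.2 um and 10 um for the same turbulence, at the zenith.
    for lam in (2.2, 10.0):
        print(lam, scale_r0(26.0, 0.5, lam))
```

The scaled values come out near 154 cm at 2.2 um and near 947 cm at 10 um, and the reduction at 45 degrees is about 19%.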

Figure 16.3 shows $T_R(\nu)$ from Eq. (16.1.5) for $\lambda = 0.5\ \mu$m and $r_0 = 26$ cm, 
with the latter value chosen to give $\alpha_0 = 0.5$ arc-sec. Also shown in Fig. 16.3 for 
comparison is $T_a(\nu)$ from Eq. (16.1.3) with the FWHM = 0.5 arc-sec. 

The profile of a point spread function (PSF) is the Fourier transform of the 
corresponding T(v), as discussed in Section 11.1. For the long-exposure MTF in 
Fig. 16.3. Modulation transfer functions $T_R(\nu)$ (solid line) at $\lambda = 0.5\ \mu$m and $r_0 = 26$ cm and 
$T_a(\nu)$ (dashed line) with FWHM = 0.5 arc-sec. The curves are computed for an 8-m telescope with 
Eqs. (16.1.5) and (16.1.3), respectively. 


Eqs. (16.1.5) or (16.1.6) we use Eq. (11.1.9) or (11.1.12), respectively, to find the 
PSF profile. Substituting Eq. (16.1.6) into Eq. (11.1.12) we get the PSFs at 
$\lambda = 0.5$ and 1 $\mu$m shown in Fig. 16.4. The parameters used are the same as those 
in Fig. 16.3. Because $T_R(\nu)$ is approximately a Gaussian function, so also is its 
PSF. A comparison of the profiles in Fig. 16.4 with those found using the 
Gaussian MTF in Eq. (16.1.4) shows similar image cores, but a more rapid 
decrease in the wings of the Gaussian PSF. 
Fig. 16.4. Point spread functions for 2-m telescope at $\lambda = 0.5\ \mu$m (dashed line) and 1 $\mu$m (solid 
line) for $T_R(\nu)$ from Fig. 16.3. 
Table 16.1 


Angular Resolution for Constant Turbulence^a 


$\lambda$ ($\mu$m)    $r_0$ (cm)    $\alpha_0$ (arc-sec)    $\alpha_d$ (arc-sec) 
0.5              26            0.50                    0.016 
2.2              150           0.37                    0.07 
10.0             940           0.28                    0.31 


^a $\Sigma$ in Eq. (16.1.8) is held constant. 


An important feature of note in Fig. 16.4 is the Strehl ratio S, the normalized 
intensity at the peak of each PSF. A diffraction-limited PSF has S > 0.80, hence 
the effect of the turbulence used to find the PSFs in Fig. 16.4 is to dramatically 
decrease the Strehl ratio. This decrease, in turn, affects the limiting magnitude 
and provides an incentive to correct for wavefront distortion. 

Table 16.1 gives values of $r_0$, $\alpha_0$, and $\alpha_d$ for several wavelengths, assuming the 
same constant $\Sigma$ in Eq. (16.1.8), hence the same turbulence, for each. The values of 
$\alpha_d$ are calculated from Eq. (16.1.1) assuming an 8-m telescope. The improvement 
in resolution with increasing wavelength is evident from these results. Note also 
that diffraction, negligible in the visible compared to seeing, grows in significance 
in the infrared and is larger than the turbulence effect at $\lambda = 10\ \mu$m. 

All of the preceding results are based on the long-exposure MTF given in Eq. 
(16.1.5). Roddier (1981) also gives an expression for a short-exposure MTF, and 
the reader should consult this reference for details. 


16.2. CORRECTION OF WAVEFRONT DISTORTION 


From our results in the previous section, especially Fig. 16.4, it is clear that 
atmospheric turbulence has a drastic negative effect on angular resolution and 
limiting magnitude of ground-based telescopes. Given that the seeing parameter 
$r_0 \propto \lambda^{6/5}$, these negative effects are larger for shorter wavelengths and it is not 
surprising that the techniques of adaptive optics were first applied at infrared 
wavelengths. The ease of correction depends strongly on the size of $r_0$, hence on 
the wavelength, as examples in this section demonstrate. 


16.2.a. SOME BASIC RELATIONS 


In order to illustrate the requirements for adaptive optics systems, we first 
present some of the necessary relations needed in our discussion. In this section, 
we draw on an excellent review by Beckers (1993). 
Detection and compensation of the phase variations on the wavefront are 
usually done by measuring the wavefront of a reference object near the target 
object. This method succeeds if the angular separation between these two objects 
is less than the isoplanatic angle $\theta_0$. A good approximation to this angle is 


$$\theta_0 = 0.3(r_0/H), \tag{16.2.1}$$


where H is the average distance of the turbulent layer. This angle corresponds to a 
lateral shift of $0.3r_0$ between wavefronts from sources separated by $\theta_0$, hence the 
overlap in common area between the wavefronts is approximately 60%. 

At separations of $\theta_0$ the rms difference between the reference and target 
wavefronts is $\lambda/6$. For $r_0 = 26$ cm, from the first line in Table 16.1, and 
$H = 5$ km, we find $\theta_0 = 3.4$ arc-sec. In the visual range only a small fraction of 
the desired targets have suitable reference objects within the isoplanatic angle. 
This has led to development of laser guide stars, a topic we comment on briefly in 
the following section. The situation for natural reference objects in the infrared is 
decidedly more favorable. 

Another angle related to $\theta_0$ is the isoplanatic angle for image motion $\theta_m$. This 
is the angular distance over which image motions are very similar. An approximate 
relation for this angle is $\theta_m = 0.3(D/H) = \theta_0(D/r_0)$. 

Another factor of crucial importance in applying the techniques of adaptive 
optics to correct for the phase variations is the rate at which the wavefront 
changes. This rate depends on wind velocities at different heights in the 
atmosphere. An approximate time scale for significant change is 


$$\tau_0 \simeq 0.3(r_0/V_{\text{wind}}). \tag{16.2.2}$$


For $r_0 = 26$ cm and $V_{\text{wind}} = 10$ m/sec, we find $\tau_0 = 0.008$ sec. It is again evident 
that the situation for detection and compensation of phase variations is more 
favorable in the infrared than in the visual range. 
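
To see how strongly these requirements relax in the infrared, the two approximate relations of this section, Eqs. (16.2.1) and (16.2.2), can be evaluated side by side. This is a sketch with illustrative function names; the coefficient 0.3 is the rounded value quoted in the text, and the layer height of 5 km and wind speed of 10 m/sec are the example values used above.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def isoplanatic_angle_arcsec(r0_m: float, layer_height_m: float) -> float:
    """Eq. (16.2.1): isoplanatic angle, converted from radians to arc-sec."""
    return 0.3 * (r0_m / layer_height_m) * ARCSEC_PER_RAD


def coherence_time_s(r0_m: float, wind_speed_m_s: float) -> float:
    """Eq. (16.2.2): approximate time scale for significant wavefront change."""
    return 0.3 * r0_m / wind_speed_m_s


if __name__ == "__main__":
    # r0 values from Table 16.1 at 0.5, 2.2 and 10 um, in meters.
    for lam_um, r0_m in ((0.5, 0.26), (2.2, 1.5), (10.0, 9.4)):
        theta = isoplanatic_angle_arcsec(r0_m, 5000.0)
        tau = coherence_time_s(r0_m, 10.0)
        print(f"{lam_um:5.1f} um  theta0 = {theta:6.1f} arc-sec  tau0 = {tau:.4f} s")
```

With the rounded coefficient the visible-wavelength angle is a few arc-sec and the time scale is about 8 msec; both grow in proportion to the seeing parameter, so at 2.2 um they are nearly six times larger.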


16.2.b. PARTIAL WAVEFRONT CORRECTION 


The parameter $r_0$, formally called the Fried parameter, can be thought of as the 
maximum diameter of a telescope whose performance is not seriously limited by 
atmospheric turbulence. Thus $r_0$ is effectively the diameter of an undistorted 
seeing cell with the number of cells over a telescope aperture $\sim(D/r_0)^2$. 

Correction of a wavefront requires sensors to detect the deviations of an actual 
wavefront from an ideal plane wavefront and act on a deformable mirror to 
correct in real time. For a telescope of diameter D, the number of sensors required 
to approach correction giving diffraction-limited performance equals the number 
of seeing cells over the aperture. With $r_0 \sim 10$-20 cm for visible wavelengths, it 
is evident that reaching near complete correction with a large telescope is very 
difficult, at best, in the visual range and challenging in the infrared. 

The phase or path differences of a wavefront entering a telescope relative to a 
plane wavefront can be characterized in terms of any set of orthogonal functions, 
although Zernike polynomials are those most often used. Beckers states that the 
uncorrected rms phase variation across a circular aperture for variations caused by 
Kolmogorov turbulence is 


$$\omega_0\ (\text{waves}) = 0.162(D/r_0)^{5/6}. \tag{16.2.3}$$


Note that setting $r_0 = D$ gives $\omega_0 = 1/6$. This is not diffraction-limited 
performance as defined in Chapter 10, but the Airy peak is still prominent. 

If the lowest Zernike modes, those of tip-tilt (x- and y-tilt in Table 10.5), are 
compensated, then the residual mean square wavefront error is 13% of that for the 
uncompensated wavefront. If the next three Zernike modes are removed, focus 
and astigmatism in Table 10.5, the residual mean square error is 6%. The 
corresponding rms wavefront errors remaining for these two cases are 


$$\omega_2\ (\text{waves}) = 0.058(D/r_0)^{5/6}, \tag{16.2.4}$$


$$\omega_5\ (\text{waves}) = 0.040(D/r_0)^{5/6}, \tag{16.2.5}$$


where the subscript on $\omega$ denotes the total number of compensated modes. For a 
large number of compensated modes, the rms wavefront error remaining is 


$$\omega_j\ (\text{waves}) = 0.086\,j^{-5/12}(D/r_0)^{5/6}. \tag{16.2.6}$$


A comparison of Eqs. (16.2.3) and (16.2.4) shows that the dominant effect of 
turbulence on the wavefront entering a telescope is tilt of the entire wavefront. 

It is instructive to take Eqs. (16.2.3)-(16.2.5) and find $D/r_0$ that corresponds 
to the diffraction limit of $\sim\lambda/14$ or 0.071 waves. Applying this to Eq. (16.2.4), 
for example, gives $D/r_0 = 1.28$. Thus a relatively simple tip-tilt device sufficient 
to achieve diffraction-limited performance in the far infrared would not be able to 
do the same in the visual range. From Eq. (16.2.6) we find that the number of 
compensated modes needed to reach the diffraction limit is $\sim(D/r_0)^2$. 
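
The diffraction-limit calculation just described is easy to repeat for each level of correction. The sketch below uses only the coefficients quoted in Eqs. (16.2.3)-(16.2.5); the dictionary and function names are illustrative.

```python
# rms wavefront-error coefficients (waves) from Eqs. (16.2.3)-(16.2.5),
# keyed by the number of compensated Zernike modes.
COEFFICIENT = {0: 0.162, 2: 0.058, 5: 0.040}


def rms_error_waves(d_over_r0: float, modes: int = 0) -> float:
    """Residual rms wavefront error in waves for a given D/r0."""
    return COEFFICIENT[modes] * d_over_r0 ** (5.0 / 6.0)


def max_d_over_r0(modes: int = 0, limit_waves: float = 1.0 / 14.0) -> float:
    """Largest D/r0 for which the residual error stays within the limit."""
    return (limit_waves / COEFFICIENT[modes]) ** (6.0 / 5.0)


if __name__ == "__main__":
    for modes in (0, 2, 5):
        # Mean-square residual relative to the uncorrected wavefront.
        fraction = (COEFFICIENT[modes] / COEFFICIENT[0]) ** 2
        print(modes, round(fraction, 3), round(max_d_over_r0(modes), 2))
```

The mean-square fractions come out near 13% and 6%, matching the values stated for tip-tilt and for five compensated modes, and tip-tilt alone reaches the limit only for D/r0 up to about 1.28.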

It is also important to note that partial compensation of atmospheric turbulence 
leads to a larger Strehl ratio S and fainter limiting magnitude. For the two cases 
corresponding to Eqs. (16.2.4) and (16.2.5), Roddier states that the maximum 
improvement factors for S are 5 and 10, respectively. 

The impact of atmospheric turbulence on the Strehl ratio is evident from the 
PSFs in Fig. 16.4. Following the procedure in Section 16.1 for computing PSFs 
using Eqs. (16.1.6) and (11.1.12), we find values of S for a range of $D/r_0$. The 
results are shown in Fig. 16.5 as a function of wavefront error given by Eq. 
(16.2.3). Also shown in Fig. 16.5 is a line for the ratio $(r_0/D)^2$ plotted versus $\omega_0$. 

Fig. 16.5. Strehl ratios for PSFs blurred by atmospheric turbulence as a function of the 
uncorrected rms wavefront error. The solid line is $(r_0/D)^2$. See text for discussion. 


For large wavefront errors the average intensity over a blurred image and the 
normalized peak intensity are proportional to this ratio, the area of the Airy disk 
divided by the area of the blurred image. 

Partial correction of the wavefront gives an increased S and the emergence of a 
peak whose FWHM is $\sim\lambda/D$. These changes in the PSF are easily illustrated by 
taking the MTF degradation factor for midfrequency statistical error given in Eq. 
(11.1.15). The correlation length $l$ in $c(\nu_n)$ is a measure of the structure on a 
wavefront and, for atmospheric turbulence, we take $l = (r_0/D)$. With this change 
in Eq. (11.1.15), we get the MTF degradation factor for partial correction 


$$T_c = \exp\{-k^2\omega^2[1 - c(\nu_n)]\}, \tag{16.2.7}$$


where $k = 2\pi/\lambda$, $\omega$ is the remaining rms wavefront error, and $c(\nu_n)$ is modeled as 
a Gaussian in the form 


$$c(\nu_n) = \exp[-4\nu_n^2(D/r_0)^2]. \tag{16.2.8}$$


As an example, we take the PSF for $\lambda = 1\ \mu$m in Fig. 16.4, with $D = 2$ m and 
$\alpha_0 = 0.5$ arc-sec as the remaining parameters. From Eqs. (16.1.7) and (16.2.3) we 
get $r_0 = 0.524$ m and $\omega_0 = 0.50$ waves. Choosing $\omega = 0.2$ waves, we find $T_c$ 
from Eqs. (16.2.7) and (16.2.8), multiply by the MTF for a perfect system, and use the 
resulting T in Eq. (11.1.12) to find the PSF. 
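
A minimal sketch of the degradation factor in Eqs. (16.2.7) and (16.2.8) is given here. It assumes that the exponent is the phase variance times the bracketed term, with the rms error expressed in waves so that the phase variance is (2 pi omega)^2; the function names are illustrative.

```python
import math


def correlation(nu_n: float, d_over_r0: float) -> float:
    """Eq. (16.2.8): Gaussian correlation with correlation length r0/D."""
    return math.exp(-4.0 * nu_n**2 * d_over_r0**2)


def partial_correction_factor(nu_n: float, omega_waves: float,
                              d_over_r0: float) -> float:
    """Eq. (16.2.7), assuming the exponent is (2*pi*omega)**2 * [1 - c]."""
    phase_variance = (2.0 * math.pi * omega_waves) ** 2
    return math.exp(-phase_variance * (1.0 - correlation(nu_n, d_over_r0)))


if __name__ == "__main__":
    d_over_r0 = 2.0 / 0.524  # D = 2 m, r0 = 0.524 m from the example
    for omega in (0.2, 0.1):
        row = [round(partial_correction_factor(nu, omega, d_over_r0), 3)
               for nu in (0.0, 0.1, 0.3, 1.0)]
        print(omega, row)
```

Once the normalized frequency exceeds roughly r0/D the correlation term vanishes and the factor levels off at exp[-(2 pi omega)^2], about 0.21 for 0.2 waves and 0.67 for 0.1 waves. Under these assumptions, that plateau is what produces an extended high-frequency tail in the MTF and a sharp core in the PSF.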

The MTFs for this example are shown in Fig. 16.6, including the uncorrected 
MTF degradation factor from Eq. (16.1.6). Note that the MTF for the partially 
corrected system has an extended tail not present in the MTF for the uncorrected 
Fig. 16.6. The MTF for a partially corrected wavefront (solid line) with $\omega = 0.2$ waves according 
to Eq. (16.2.7). Perfect MTF (dotted line); uncorrected MTF (dashed line). See text following Eq. 
(16.2.8) for values of the other parameters. 


system. The PSFs for these MTFs are shown in Fig. 16.7. Note the presence of a 
diffraction-limited core whose FWHM = 0.12 arc-sec and an increase in S of a 
factor of 4.4. In effect, the partial correction has moved some of the energy from 
the inner part of the blurred image into a sharp core and left the outer parts 
essentially unchanged. 
Fig. 16.7. The PSFs at $\lambda = 1\ \mu$m for partially corrected MTF (solid line) and uncorrected MTF 
(dashed line). The MTFs are shown in Fig. 16.6. 

Fig. 16.8. The MTF for a partially corrected wavefront (solid line) with $\omega = 0.1$ waves according 
to Eq. (16.2.7). Perfect MTF (dotted line); uncorrected MTF (dashed line). Other parameters are the 
same as those for Fig. 16.6. 


Following the same procedure, but choosing w = 0.1 waves, we get the results 
shown in Fig. 16.8 for the MTF and Fig. 16.9 for the PSF. With this choice of $\omega$ 
the system is approaching the diffraction limit at which S = 0.80. A semilog plot 
of the PSF with w = 0.1 waves shows the presence of several bright Airy rings 
around the main Airy peak. 
Fig. 16.9. The PSFs at $\lambda = 1\ \mu$m for partially corrected MTF (solid line) and uncorrected MTF 
(dashed line). The MTFs are shown in Fig. 16.8. 

This discussion of the effects of atmospheric turbulence on image characteristics 
and their partial correction is only an overview of a large subject area, but 
should suffice to give the reader an idea of the basic concepts. For a thorough 
discussion of the many facets of adaptive optics, consult the text by Tyson (1991). 


16.3. ADAPTIVE OPTICS: SYSTEMS AND COMPONENTS 


From our preceding discussion we see that the requirement for an adaptive 
optics system is a means of measuring the phase variations over an incoming 
wavefront and using that information to effectively flatten the wavefront in real 
time. In this section we consider some of the basics of the optical elements used 
in adaptive optics systems. 


16.3.a. AN OVERVIEW 


A schematic diagram illustrating the essential components of an adaptive 
optics system is shown in Fig. 16.10. Light from the images of a target object and 
a reference source at an angular separation less than the isoplanatic angle $\theta_0$ is 
collimated by a mirror or lens. The collimator forms an image of the entrance 
pupil on a deformable mirror DM and a second optical system following a tip-tilt 
mirror TTM reimages the target and reference objects. For a system making 
tip-tilt corrections only, the deformable mirror is not a part of the optical train. 

Light from the target is sent to a reimaging camera or to a spectrometer, while 
the light from the reference is sent to a wavefront sensor. All of the optics in the 
reference beam are configured to reimage the pupil on the wavefront sensor. The 











Fig. 16.10. Schematic of an adaptive optics system. F, telescope focus; Col, collimator; DM, 
deformable mirror; TT, tip-tilt mirror; Cam, camera; WS, wavefront sensor; C, computer. 

computer-processed signal from the wavefront sensor is then fed back to DM and 
TTM to compensate for the detected phase errors in the beam at the sensor. Thus 
the target beam is reflected from two mirrors that have been adjusted to correct 
for the detected atmospheric turbulence. 

The number of sensor elements needed to correctly sample the distortions of 
the incoming beam and achieve near diffraction-limited performance is $\sim(D/r_0)^2$ 
and the rate at which the sampling is done is governed by $\tau_0$ in Eq. (16.2.2). 
Sampling at time intervals longer than $\tau_0$ results in degraded correction. Other 
important considerations include the number of detected photons within one 
sampling time and the SNR for this sample. 
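
These two sizing rules can be turned into rough numbers. The sketch below treats the element count as (D/r0)^2 and the minimum sampling rate as the reciprocal of the time scale in Eq. (16.2.2); both are order-of-magnitude estimates, and the function names and the 10 m/sec wind speed are illustrative assumptions.

```python
def sensor_elements(diameter_m: float, r0_m: float) -> float:
    """Approximate number of sensor elements, (D/r0)**2."""
    return (diameter_m / r0_m) ** 2


def min_sampling_rate_hz(r0_m: float, wind_speed_m_s: float = 10.0) -> float:
    """Reciprocal of the time scale 0.3 * r0 / V_wind from Eq. (16.2.2)."""
    return wind_speed_m_s / (0.3 * r0_m)


if __name__ == "__main__":
    # 8-m telescope with r0 = 26 cm (0.5 um) and r0 = 150 cm (2.2 um).
    for lam_um, r0_m in ((0.5, 0.26), (2.2, 1.5)):
        n = sensor_elements(8.0, r0_m)
        rate = min_sampling_rate_hz(r0_m)
        print(f"{lam_um} um: ~{n:.0f} elements, sampled at >= {rate:.0f} Hz")
```

For an 8-m aperture this gives on the order of a thousand elements sampled above 100 Hz in the visible, against a few tens of elements at about 20 Hz at 2.2 um.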


16.3.b. ADAPTIVE MIRRORS 


A deformable mirror is generally one of two types, a segmented mirror or a 
mirror with a continuous faceplate. Each mirror in a segmented mirror is adjusted 
separately in tip-tilt and piston (in an axial direction). Tip-tilt adjustments are 
controlled by the wavefront sensor, but axial adjustments must be controlled 
separately to ensure approximate continuity of the overall surface. The desired 
number of separate mirrors is $\sim(D/r_0)^2$, each with its own set of actuators. 

Continuous faceplate mirrors are usually driven by bidirectional actuators 
using piezoelectric materials, with fewer actuators required because continuity is 
automatically ensured. This type of mirror also avoids the problems of light loss 
and diffraction caused by the gaps between the elements of a segmented mirror. 
Continuous mirrors are the usual choice for adaptive optics systems intended for 
faint sources. 


16.3.c. WAVEFRONT SENSORS 


A commonly used wavefront sensor is the Hartmann-Shack sensor shown 
schematically in cross-section in Fig. 16.11. An image of the telescope pupil is 
located on an array of lenslets, each of which reimages a portion of the incident 
wavefront on a detector array. For a fully adaptive system the size of each lenslet 
is approximately $r_0$. The image from each lenslet is shifted if the part of the 
wavefront being reimaged is tilted, with the magnitude of the shift at the detector 
proportional to the tilt angle and the direction of the shift giving the azimuth of 
this angle. 

The composite of these numbers gives the information needed to find the 
overall tilt of the entire wavefront in azimuth and angle, while the separate shifts 
give similar information on each portion of the wavefront. The former is fed to 
the tip-tilt mirror and, for a fully adaptive system, the latter to the deformable 
mirror. 
Fig. 16.11. Schematic cross section of a Hartmann-Shack wavefront sensor. FL, field lens; 
PP, pupil plane; D, detector. Edge-rays are shown for one lenslet. 


Advantages of a sensor of this type include simplicity, compactness, and 
sensitivity over a broad wavelength range with nearly all of the light from the 
reference object sent to the detector. CCD arrays with low read-out noise and 
high quantum efficiency are the detectors of choice for visible and near infrared 
wavelengths. Beckers gives a table of limiting magnitudes per lenslet for different 
spectral bands and a set of assumptions about seeing conditions and the detector. 
For the R-band this limit is about V = 10. The reader should consult the review by 
Beckers (1993) for details. 


16.4. CONCLUDING REMARKS 


In this chapter we have provided some of the basics needed to understand the 
importance of the field of adaptive optics. We have not discussed the variety of 
wavefront sensors used in these systems or the details of adaptive mirrors. Nor 
have we discussed in detail the important area of laser guide stars. Here we give a 
brief look at the impetus behind the work in this latter area, but leave the specifics 
to specialists in the field. 

The principal limitation in applying adaptive optics is the lack of natural 
reference stars of sufficient brightness near any given target object. The small size 
of the isoplanatic angle in the visible range, typically 1 or 2 arc-sec, translates into 
the likelihood of finding a suitable reference star of roughly 1 in 10,000 at an 
average galactic latitude. For infrared wavelengths this limit is less severe, with a 
likelihood of about 1% at 2.2 um and near unity at a wavelength of 10 um. This 
limitation can be largely removed by using an artificial guide star provided by the 
reemission of light from sodium atoms in a layer in the upper atmosphere 
illuminated by a sodium laser beam. A good introduction to this rapidly 
developing field is given by Beckers (1993). 
Our final comments are directed toward the applications of adaptive optics, 
especially in an era of 8- to 10-m telescopes. The advantages for imaging are 
obvious when the difference between diffraction-limited and seeing-limited 
resolution is large. The diffraction limit from Eq. (16.1.1) is approximately 
0.01 arc-sec for an 8-m telescope, or about 1 parsec at the distance of the Virgo 
cluster. Better resolution is also accompanied by a fainter limiting magnitude for 
point sources, a topic we discuss further in Chapter 17. 

The advantages of adaptive optics for spectrometry are equally impressive. 
Our discussion in Chapter 15, especially that based on Eq. (15.2.1), shows the 
need for larger spectrometers if the resolving power-slit width product is to be 
maintained with larger telescopes. In the diffraction limit we substitute $\lambda$ for $\phi D$ 
to get the limiting spectral resolving power, as pointed out in Section 12.5. To 
ensure that most of the light in a diffraction-limited image passes through the 
aperture of a spectrometer, a more practical substitution is $2\lambda$ for $\phi D$. By 
replacing $\phi D$ with $2\lambda$ in Eqs. (15.2.1) and (15.2.3), we find that the dependence 
on the telescope diameter vanishes. As pointed out in an example in Section 12.5, 
this means that the need for fast cameras also vanishes. 

We leave it to the reader to further explore the advantages of adaptive optics. 
Chapter 17 Detectors, Signal-to-Noise, and Detection Limits 


Given a particular telescope and instrument combination, it is essential to 
know its capabilities for making a specific type of observation. To determine 
these capabilities requires not only a knowledge of the telescope and instrument 
characteristics, but also those of other parts of the overall system. The system as 
used here includes the detector and, in the case of ground-based telescopes, the 
atmosphere. The ability of a given system to measure a given signal is then 
determined by including all of these factors in a system analysis. 

In this chapter we discuss those characteristics of detectors that are important 
for the detection and resolution of point sources. We consider a detector in terms 
of its modulation transfer function (MTF) and the effect this has on resolution of 
images. We discuss the characteristics of some types of solid-state detectors and 
use these results in examples of limiting magnitude calculations. We do not 
discuss the physics of detectors. 

In most cases the output of a detector includes both the desired signal from the 
source being observed and “noise” from unwanted sources. This noise can arise 
from many different sources, such as light from the sky background in the 
vicinity of the object under study, and noise intrinsic to the detector in the form of 
dark current for a photomultiplier or dark counts for a photon-counting detector. 
For solid-state detectors there is also read noise introduced during the process of 
reading out the accumulated information stored on the array of pixels. 
In the absence of sky background and detector noise, the recorded signal 
shows fluctuations due to “photon noise.” This noise arises from 
statistical fluctuations in the number of recorded photons about some average 
value, where the average number of detected photons is found from a large 
number of identical exposures. Analysis of the effect of any noise contributor is 
done in terms of SNR, the topic of one of the sections in this chapter. 

The final section of this chapter is a discussion of the detection limits in the 
presence of noise for different types of observations. Included are relations 
appropriate to direct imaging, and spectrometric observations for both slit-limited 
and slitless modes. Examples of selected modes are given for both ground-based 
telescopes and HST. 


17.1. DETECTOR CHARACTERISTICS 


With few exceptions, the detectors used for imaging and spectroscopy in the 
visible and infrared are solid-state devices, such as the charge-coupled device 
(CCD), based on the properties of semiconductors. The details of the physics of 
these devices can be found in monographs on detectors; characteristics of typical 
large solid-state detectors of interest in this chapter are given in Table 17.1. 

For our purposes the important detector parameters are pixel size, quantum 
efficiency, intrinsic noise, MTF, and Nyquist sampling. We discuss each of these 
in turn in this section. 


17.1.a. PIXELS, QUANTUM EFFICIENCY, AND INTRINSIC NOISE 


A pixel or picture element is a single detector element in an array of elements, 
for example, a single diode on a detector such as those in Table 17.1. Pixel 
sizes are typically in the range 10-30 $\mu$m. Matching pixel size to image or 
spectral resolution is an important consideration in the design of most systems, as 
has been noted on numerous occasions in earlier chapters. 


Table 17.1 


Nominal Characteristics of Large Solid-State Arrays 


                      CCD               HgCdTe            InSb 
Format                2048 x 4096       1024 x 1024       1024 x 1024 
Pixel (um)            15 x 15           20 x 20           27 x 27 
Size (mm)             30 x 60           20 x 20           30 x 30 
Dark/Pixel            1 e-/hour         0.1 e-/sec        0.1 e-/sec 
Quantum Efficiency    400-700 nm >70%   1-2 um ~70%       1-5 um ~80% 
Longwave Limit        ~1 um             ~2.5 um           ~5.5 um 
Read Noise (e- rms)   4                 20                25 
Full Well             200,000           100,000           300,000 

If $n_i$ is the number of photons per sec incident on a detector and $n_d$ is the 
number of detected photons per sec, the quantum efficiency Q is defined as $Q = n_d/n_i$. 
For a sensitive photographic plate Q is a few percent at best, while for a CCD 
detector the quantum efficiency approaches unity for red wavelengths. Given the 
rapid development of CCD technology, especially in area and number of pixels, 
this detector is now the choice for nearly all spectrometric and direct imaging 
systems in the visible and near infrared. 

The principal noise contribution in the output of most electronic detectors is 
the dark current or, equivalently, the dark count, or number of electrons per sec 
per pixel generated in the absence of an input light signal. The size of the dark 
count is reduced by cooling the detector and can often be reduced to a negligible 
level. For detectors such as those in Table 17.1 an additional noise contributor is 
the read noise, usually given as some number of equivalent electrons rms. These 
noise factors are included in the discussion of SNR and relations for detection 
limits in later sections. 

For more details on particular detectors the reader should consult data sheets 
prepared by manufacturers and references devoted to discussions of detectors. 


17.1.b. MODULATION TRANSFER FUNCTION OF IDEAL PIXEL 


The detection characteristics of a pixel array are determined in large part by 
the pixel size, which, in turn, can be described in terms of a detector MTF. An 
ideal detector is one for which the counts from a given pixel are completely 
independent of those from neighboring pixels, hence an ideal pixel can be 
represented as a rectangular well. 

Consider such a pixel of dimensions a and b in the x- and y-directions, 
respectively. We derive an MTF by defining this rectangular aperture as a PSF 
according to 


$$ i(x, y) = \begin{cases} 1, & |x| < a/2 \ \text{and} \ |y| < b/2, \\ 0, & |x| > a/2 \ \text{or} \ |y| > b/2, \end{cases} \qquad (17.1.1) $$


where the pixel center is at x = y = 0. Substituting Eq. (17.1.1) into Eq. (11.1.6) 
gives the normalized MTF as 


$$ T_p(\nu_x, \nu_y) = \mathrm{sinc}(\pi \nu_x a)\,\mathrm{sinc}(\pi \nu_y b), \qquad (17.1.2) $$


where sinc(z) = (sin z)/z. The spatial frequencies in Eq. (17.1.2) are defined
following Eq. (11.1.6) in terms of the orientation of the input sine target,
discussed in Section 11.1, relative to the x, y coordinate system. If the lines of
the sine target are parallel to x or y, one sinc function in Eq. (17.1.2) becomes
one.

Assuming square pixels of side Δ and ν_x = ν (so that ν_y = 0), Eq. (17.1.2) becomes


$$ T_p(\nu) = \mathrm{sinc}(\pi \nu \Delta), \qquad T_p(\nu_d) = \mathrm{sinc}(\pi \nu_d), \qquad (17.1.3) $$


where we define a detector normalized frequency ν_d = νΔ. Note that the MTF and
normalized detector frequency are independent of wavelength. In the discussion
to follow we take Eq. (17.1.3) as the representation of an ideal pixel array.

The system MTF, including detector, is the product T(ν)T_p(ν), where T(ν)
includes the factors in Eq. (11.1.14) for a system with aberrations. Following the
procedure in Chapter 11, we rewrite this product in terms of the normalized
spatial frequency ν_n of the system. Denoting the overall system MTF by T_s, we get


$$ T_s(\nu_n) = T(\nu_n)\,\mathrm{sinc}\left(\frac{\pi \nu_n \Delta}{\lambda F}\right) = T(\nu_n)\,\mathrm{sinc}\left(\frac{\pi \nu_n}{N}\right), \qquad (17.1.4) $$


where N = λF/Δ = number of pixels per length λF. From Eq. (10.2.9) we see
that λF is the approximate radius of the Airy disk.

Fig. 17.1. Pixel MTF for different values of N based on Eq. (17.1.4). Here N = λF/Δ, where Δ is
the pixel size.

Fig. 17.2. System MTF including detector for circular aperture with central obscuration. Detector
MTF is shown in Fig. 17.1.

Figure 17.1 shows pixel MTFs for three different values of N. Note that the 
MTF is negative for some ν_n < 1 when N < 1. System MTFs for a perfect system
with a circular aperture and ε = 0.33 are shown in Fig. 17.2 for each detector
MTF in Fig. 17.1. The effect of the detector MTF is evident by inspection of the 
curves in Fig. 17.2. For N = 2 there is little change of the MTF due to the optics 
only, with progressively greater change as N decreases. Note that high-frequency 
information is lost or masked for N = 0.5. This is not surprising, given that a 
single pixel spans an Airy disk diameter in this case. 
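
The behavior described above can be checked numerically. The following Python sketch evaluates the pixel factor sinc(πν_n/N) of Eq. (17.1.4) and multiplies it by the diffraction MTF of a perfect circular aperture. As a simplifying assumption the aperture is taken as unobscured, whereas Fig. 17.2 uses a central obscuration of 0.33, so the numbers will differ somewhat from the figure. The function names are illustrative only.

```python
import math

def sinc(z):
    """sinc(z) = sin(z)/z, with the limiting value 1 at z = 0."""
    return 1.0 if z == 0.0 else math.sin(z) / z

def pixel_mtf(nu_n, n_pix):
    """Ideal-pixel factor of Eq. (17.1.4): sinc(pi * nu_n / N)."""
    return sinc(math.pi * nu_n / n_pix)

def aperture_mtf(nu_n):
    """MTF of a perfect circular aperture (assumption: no central obscuration)."""
    if nu_n >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(nu_n) - nu_n * math.sqrt(1.0 - nu_n ** 2))

def system_mtf(nu_n, n_pix):
    """Product of the optics MTF and the pixel MTF."""
    return aperture_mtf(nu_n) * pixel_mtf(nu_n, n_pix)

if __name__ == "__main__":
    for n_pix in (0.5, 1.0, 2.0):
        row = [round(system_mtf(k / 10.0, n_pix), 3) for k in range(0, 11, 2)]
        print(f"N = {n_pix}: {row}")
```

For N = 0.5 the pixel factor changes sign at ν_n = 0.5, which is the negative region noted in the text.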


17.1.c. MODULATION TRANSFER FUNCTION AND NYQUIST SAMPLING


According to the Nyquist criterion for discrete sampling, two samples per
resolution element are required for unambiguous resolution of images that are
just resolved according to the Rayleigh criterion. With a linear separation at the
Rayleigh limit given by λF, the Nyquist criterion is satisfied with a pixel size
Δ = λF/2, hence N = 2. Strictly speaking, this criterion applies to sampling in
one direction, but it is also appropriate for panoramic sampling.

Given N = λF/Δ, the curves in Fig. 17.1 change in proportion to wavelength
with F/Δ held constant. If, for example, the curve for N = 2 is appropriate for
wavelength λ, then the N = 1 curve is appropriate for λ/2 with the same pixel
size and focal ratio. In this case, therefore, an image that is correctly sampled at
longer wavelengths is not adequately sampled at shorter wavelengths.

We now apply these results to selected HST cameras assuming: (1) a perfect 
telescope; and (2) a pixel MTF of the form given in Eq. (17.1.4). The first 
assumption is not far from correct for visible and longer wavelengths but is a poor 
assumption in the far ultraviolet. The second assumption presumes an ideal pixel 
and may not be strictly valid for a real detector. Nevertheless, we proceed with 
these assumptions to illustrate the general characteristics of HST cameras in 
selected modes. 

Listed in Table 17.2 are the focal ratios of the direct imaging modes of the
Advanced Camera for Surveys (ACS) in the visible and near infrared and the
Near-Infrared Camera and Multi-Object Spectrometer (NICMOS). Also given in Table
17.2 is the width of a pixel projected on the sky for each mode. Note that the pixel
size of the f/17.2 wide-field mode of NICMOS is larger than the FWHM, given
in Fig. 11.15, for all wavelengths to which it is sensitive. For the wide-field mode
of ACS we see that the Nyquist criterion is nearly satisfied at a wavelength of
1 μm, the longwave limit of the CCD. The Nyquist criterion is satisfied for the
f/80 camera of NICMOS at 1 μm, but in this case the range of sensitivity extends
to about 2.5 μm.

Taking the entries for F/Δ in Table 17.2, we get the lines representing N for
each of the camera modes shown in Fig. 17.3. For each mode the range of
wavelengths for which N > 2 is the range in which the image of a point source is
oversampled. When N = 2 there are approximately twelve pixels covering the
Airy disk.


Table 17.2


Selected Characteristics of HST Cameras^a


          ACS                                  NICMOS
F     Δ (arc-sec)   F/Δ^b          F      Δ (arc-sec)   F/Δ^b
26    0.050         1.73           17.2   0.20          0.43
72    0.025         3.43           45     0.076         1.13
                                   80     0.043         2.00


a ACS: Δ = 15 μm for f/26 mode (Wide Field Camera);
Δ = 21 μm for f/72 mode (High-resolution Camera); and
NICMOS: Δ = 40 μm.

b The unit of F/Δ is μm^-1.


Fig. 17.3. Number of pixels spanning length λF for selected camera modes of the Hubble Space
Telescope. See Table 17.2 for values of F/Δ. ACS: f/72 (dash-dot-dot-dot), f/26 (solid); NICMOS:
f/80 (long dash), f/45 (dash-dot), f/17.2 (short dash).
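
The lines in Fig. 17.3 follow directly from N = λF/Δ. As a minimal sketch, the Python fragment below tabulates N at 1 μm and the wavelength at which N = 2 for each mode, using the F/Δ entries of Table 17.2 (the f/72 value is the one stated in the table footnote). The dictionary and function names are illustrative assumptions.

```python
# F/Delta in 1/micron for each camera mode, from Table 17.2
F_OVER_DELTA = {
    "ACS f/26": 1.73,
    "ACS f/72": 3.43,
    "NICMOS f/17.2": 0.43,
    "NICMOS f/45": 1.13,
    "NICMOS f/80": 2.00,
}

def pixels_per_lambda_f(wavelength_um, f_over_delta):
    """N = lambda * F / Delta, with the wavelength in microns."""
    return wavelength_um * f_over_delta

def nyquist_wavelength_um(f_over_delta):
    """Shortest wavelength (microns) for which N >= 2."""
    return 2.0 / f_over_delta

if __name__ == "__main__":
    for mode, ratio in F_OVER_DELTA.items():
        print(f"{mode:14s} N(1 um) = {pixels_per_lambda_f(1.0, ratio):.2f}, "
              f"N = 2 at {nyquist_wavelength_um(ratio):.2f} um")
```

Images at wavelengths longer than the value returned by `nyquist_wavelength_um` are oversampled in the sense used above; shorter wavelengths are undersampled.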

It is important to note that detection of a single point source does not require a 
large number of pixels covering the Airy disk. For the f/26 mode of the WFC of
ACS, for example, most of the energy of a single star image in the blue and near 
ultraviolet is recorded on four pixels, sufficient for many types of observations. 


17.1.d. APPROXIMATE PIXEL MODULATION TRANSFER FUNCTION 


An approximate relation often used to represent the MTF of a square pixel is a 
Gaussian profile of the form 


$$ T_p(\nu) = \exp\left[-0.282\,(\pi \nu \Delta)^2\right], \qquad (17.1.5) $$


where the constant 0.282 is chosen to make T_p = 0.5 at ν = 1/(2Δ). Figure 17.4
shows T_p for a sinc function and Gaussian, the former according to Eq. (17.1.3)
and the latter from Eq. (17.1.5). If curves like those in Fig. 17.2 are generated
using Eq. (17.1.5) rather than Eq. (17.1.3), the resulting system MTFs are little
different except that the curve for N = 0.5 does not go negative. The general
comments made about Fig. 17.2 are also true for the modified MTFs. Because the
MTF and PSF are a Fourier transform pair, the PSF for Eq. (17.1.5) is also a
Gaussian.

Fig. 17.4. Plot of sinc and Gaussian representations of pixel MTF. See Eq. (17.1.5) and the
following discussion.
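
The comparison in Fig. 17.4 is easy to reproduce. The sketch below, with illustrative function names, evaluates the sinc form of Eq. (17.1.3) and the Gaussian form of Eq. (17.1.5) as functions of the product νΔ. It also computes the constant that would give exactly 0.5 at νΔ = 1/2, which comes out near 0.281, so the quoted 0.282 gives a value only slightly below 0.5.

```python
import math

def sinc_pixel_mtf(nu_delta):
    """Eq. (17.1.3), with nu_delta = nu * Delta."""
    z = math.pi * nu_delta
    return 1.0 if z == 0.0 else math.sin(z) / z

def gaussian_pixel_mtf(nu_delta, c=0.282):
    """Eq. (17.1.5): exp[-c (pi nu Delta)^2]."""
    return math.exp(-c * (math.pi * nu_delta) ** 2)

# Constant that makes the Gaussian exactly 0.5 at nu * Delta = 1/2
C_EXACT = math.log(2.0) / (math.pi / 2.0) ** 2

if __name__ == "__main__":
    print(f"c for T = 0.5 at nu*Delta = 0.5: {C_EXACT:.4f}")
    for nd in (0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
        print(f"{nd:4.2f}  sinc = {sinc_pixel_mtf(nd):+.3f}  "
              f"gauss = {gaussian_pixel_mtf(nd):.3f}")
```

Unlike the sinc form, the Gaussian is positive at all frequencies, which is why the N = 0.5 curve does not go negative when Eq. (17.1.5) is used.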


17.1.e. IMAGE SHARPNESS AND SAMPLING 


The importance of Nyquist sampling is shown by an application involving 
finding the best focus of a camera or telescope. If an image is diffraction-limited 
or nearly so, then the following sharpness algorithm provides a means of finding 
the optimum focus. This algorithm has the form 


$$ \Sigma = \sum_{i=1}^{M} \sum_{k=1}^{M} I_{ik}^{2}, \qquad (17.1.6) $$


where I_ik is the intensity of the image for pixel (i,k) of an (M × M) array
centered on the image. The value of M selected is not critical but it should be
large enough to span one or two bright rings around the Airy disk. For a properly
sampled image with N = 2 pixels per length λF, a good choice is M = 7.

The intensity is measured for an image at each of several focus positions. In a 
Cassegrain telescope, such as HST, this is simply done by moving the secondary 
mirror. The value of Σ is then computed for each image according to Eq. (17.1.6).
These results, when plotted on a graph, show a maximum for Σ at best focus.

We now outline an exercise applying this algorithm to diffraction-limited 
images of different wavelengths imaged with a given detector. If we choose 
λ = 1 μm, F = 30, and pixel size Δ = 15 μm, then there are two pixels spanning
length λF. For this correctly sampled wavelength, the value of Σ does not depend
on the precise position of the peak of the PSF on the pixel array. Whether this
peak is centered on one pixel or is at a corner of a pixel is of no consequence. If,
however, a PSF of a shorter wavelength is used, then the PSF is undersampled
(λF < 2Δ) and the position of the PSF on the array does affect the value of Σ. Not
surprisingly, Σ is largest for an undersampled wavelength when the peak of the
PSF is centered on one pixel.
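
A focus search of this kind can be sketched in a few lines of Python. The sketch assumes the sharpness is the sum of squared pixel intensities over the central M × M pixels, which is one common form of sharpness metric and is an assumption here about Eq. (17.1.6). The function names and the toy images are hypothetical.

```python
def sharpness(image, m):
    """Sum of squared intensities over the central m x m pixels of a 2-D list."""
    rows, cols = len(image), len(image[0])
    r0, c0 = (rows - m) // 2, (cols - m) // 2
    return sum(image[i][k] ** 2
               for i in range(r0, r0 + m)
               for k in range(c0, c0 + m))

def best_focus(focus_positions, images, m=7):
    """Return the focus position whose image has the largest sharpness."""
    scores = [sharpness(img, m) for img in images]
    return focus_positions[scores.index(max(scores))]

if __name__ == "__main__":
    # Toy 3 x 3 images with the same total flux (9 units); hypothetical data
    defocused = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    focused = [[0, 1, 0], [1, 5, 1], [0, 1, 0]]
    print(best_focus([-1.0, 0.0], [defocused, focused], m=3))
```

Because the total flux is the same in both toy images, the larger sum of squares for the concentrated image reflects only how the light is distributed, which is the property the focus search relies on.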


17.2. SIGNAL-TO-NOISE RATIO 


The performance of a system is most often given in terms of a quantity called
the signal-to-noise ratio (SNR). The signal is the total number of detected
photons on a given pixel, denoted here by n_s. If the signal is recorded a large
number of times under identical conditions, the mean signal is ⟨n_s⟩ with a
statistical fluctuation in the number of detected photons about the mean. This
fluctuation is a consequence of the random arrival rate of the photons at the
detector and the random selection of those that are detected.

The variance or mean square noise in the signal is ⟨n_s⟩, and the standard
deviation or noise is ⟨n_s⟩^(1/2). For an ideal detector, one giving only counts from the
incident light, the SNR in the presence of photon noise is


$$ \mathrm{SNR} = \langle n_s \rangle / \sqrt{\langle n_s \rangle} = \sqrt{\langle n_s \rangle}. \qquad (17.2.1) $$


For a real detector measuring a signal in the presence of a background source, the
mean square noise from uncorrelated sources is the sum of the variances of the
separate sources, and the SNR is


$$ \mathrm{SNR} = \frac{\langle n_s \rangle}{\sqrt{\langle n_s \rangle + \langle n_b \rangle + \langle n_d \rangle}}, \qquad (17.2.2) $$


where ⟨n_b⟩ is the mean number of counts from the background, and ⟨n_d⟩ is the
number of extraneous counts from the detector. The extra detector counts are due to dark
counts and, in the case of solid-state detectors, read noise. The fractional
accuracy for a given SNR is defined as 1/SNR, thus an observation with 1%
accuracy requires a SNR of 100. It is evident from Eq. (17.2.2) that better
accuracy is achieved with smaller ⟨n_b⟩ or ⟨n_d⟩.

We now consider two limiting cases of SNR for an ideal detector, one with 
zero background, the other with background large compared to the signal. Let S 
and B denote the incident signal and background flux, respectively, in photons per 
second on an ideal detector. If the exposure time is t and the quantum efficiency is
Q, then the SNRs are


$$ \mathrm{SNR} = \sqrt{SQt}, \qquad B \ll S, \qquad (17.2.3a) $$
$$ \mathrm{SNR} = \sqrt{SQt}\,\sqrt{S/B}, \qquad B \gg S. \qquad (17.2.3b) $$


It is instructive to examine these relations for SNR from two different perspectives.
Consider first the situation where S and B are constant, and observations are
made of the same source with different detectors and/or exposure times. In both
cases we see that the SNR ∝ √(Qt), hence a larger Q with a given exposure time or
a longer exposure with the same Q gives a larger SNR. It is also evident that
increasing the SNR by a factor of k requires a Qt product that is k² times larger.

Alternatively, consider two sources with signal flux S_1 and S_2, respectively,
observed against the same background B to the same SNR level. From Eq.
(17.2.3a,b) we get


$$ \frac{S_1}{S_2} = \frac{Q_2 t_2}{Q_1 t_1}, \quad \text{signal-limited}, \qquad (17.2.4a) $$
$$ \frac{S_1}{S_2} = \left(\frac{Q_2 t_2}{Q_1 t_1}\right)^{1/2}, \quad \text{background-limited}. \qquad (17.2.4b) $$


As a numerical example, let t_1 = t_2 and Q_1/Q_2 = 6.3. In the signal-limited case
S_1 = S_2/6.3; in the background-limited case we have S_1 = S_2/2.5. Thus the more
sensitive detector can observe a source that is two stellar magnitudes fainter in the
signal-limited case, but only one magnitude fainter in the other case, for the same
exposure time and SNR.
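
The numerical example can be verified with a short calculation. The sketch below converts the flux ratios of Eqs. (17.2.4a,b) into magnitude differences using Δm = -2.5 log(S_1/S_2). The function names and the `regime` argument are illustrative.

```python
import math

def flux_ratio(qt_ratio, regime):
    """S_1/S_2 at equal SNR, where qt_ratio = (Q_1 t_1)/(Q_2 t_2)."""
    if regime == "signal":
        return 1.0 / qt_ratio             # Eq. (17.2.4a)
    if regime == "background":
        return 1.0 / math.sqrt(qt_ratio)  # Eq. (17.2.4b)
    raise ValueError("regime must be 'signal' or 'background'")

def magnitude_gain(qt_ratio, regime):
    """Magnitudes by which source 1 can be fainter than source 2."""
    return -2.5 * math.log10(flux_ratio(qt_ratio, regime))

if __name__ == "__main__":
    for regime in ("signal", "background"):
        print(regime, round(magnitude_gain(6.3, regime), 2))
```

With a ratio of 6.3 the gain is close to two magnitudes in the signal-limited case and close to one magnitude in the background-limited case, as stated in the text.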

From Eqs. (17.2.3) it is evident that the SNR achieved with a given detector is
larger with a longer exposure time. Consider a single exposure of length t
compared to k separate exposures, each of length t/k. For each short exposure,
the SNR is √k smaller than for the single long exposure for either case in Eqs.
(17.2.3). By replacing t by k · (t/k) in Eq. (17.2.3) we see that


$$ \mathrm{SNR}_l = \sqrt{k}\,\mathrm{SNR}_s, \qquad (17.2.5) $$


where the subscripts l and s refer to the long and short exposures, respectively. It
follows, therefore, that the SNR of k added exposures is the same as that of a
single long exposure, where the total exposure time is the same for both.

As a final limiting case, consider the situation where the detector noise is large 
compared to either the background or signal. Assuming the detector noise is due 
to both dark counts and read noise, the SNR for the detector-limited case can be
written as


$$ \mathrm{SNR} = \frac{SQt}{\sqrt{Ct + R^2}}, \qquad (17.2.6) $$


where C is the dark count per second and R is the rms read noise. If read noise is
negligible compared to dark count, the addition of k separate short exposures
leads to the result given in Eq. (17.2.5). If R is dominant, however, then the SNR
of k added exposures is smaller than that of a single long exposure by a factor of
√k, because each of the k readouts contributes its own read noise.
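
The two limits are easy to confirm numerically. The following sketch compares one exposure of length t with k added exposures of length t/k under Eq. (17.2.6), assuming that the variances of the individual exposures add. The function names and the sample values are illustrative only.

```python
import math

def snr_single(s, q, t, c, r):
    """Eq. (17.2.6) for one exposure of length t."""
    return s * q * t / math.sqrt(c * t + r ** 2)

def snr_coadded(s, q, t, c, r, k):
    """SNR of k added exposures, each of length t/k (variances add)."""
    signal = s * q * t
    variance = k * (c * (t / k) + r ** 2)
    return signal / math.sqrt(variance)

if __name__ == "__main__":
    s, q, t, k = 1.0, 0.8, 2400.0, 4
    # Dark-count dominated: co-adding costs nothing
    print(snr_single(s, q, t, 0.003, 0.0), snr_coadded(s, q, t, 0.003, 0.0, k))
    # Read-noise dominated: co-adding lowers the SNR by sqrt(k)
    print(snr_single(s, q, t, 0.0, 5.0) / snr_coadded(s, q, t, 0.0, 5.0, k))
```

The read-noise term enters once per readout, which is why splitting an exposure is penalized only when R dominates.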

This treatment is sufficient to illustrate how system performance is specified in 
terms of SNR. We now use this parameter to give a more detailed analysis for 
both photometry and spectroscopy. 


17.3. DETECTION LIMITS AND SIGNAL-TO-NOISE RATIO 


Most telescope/instrument systems are used for observations that are at or 
near the limits of the system. These limits are due to source faintness, sky 
background, limited observing time, detector noise, or any combination of these. 
It is therefore important to determine how each of these affects the magnitude 
limit that can be reached at a given SNR. Treatments like the one that follows 
have been given by several authors, including Baum (1962), Code (1973), and 
Bowen (1964). References are listed at the end of the chapter. 

In this section we consider three types of observations and the relation 
between source brightness, exposure time, and SNR in the presence of various 
factors that degrade the SNR. Types of observations discussed include stellar 
photometry, slit-limited spectroscopy at various resolutions, and slitless spectro- 
scopy. For each observation mode we illustrate the general results with graphs for 
the HST and large ground-based telescopes of various diameters, using detector 
characteristics suitable for each. We assume in all cases that the light is collected 
by a single telescope; situations in which an array of telescopes sends light to one 
or more instruments are discussed by Code (1973).

We begin with the expression for the photon flux collected by a telescope of 
diameter D and transmitted to the detector. For a star of apparent magnitude m, 
the signal flux is 


$$ S = N_0 \tau \frac{\pi}{4}(1 - \varepsilon^2) D^2 \Delta\lambda \cdot 10^{-0.4m} = 0.7\,N_0 \tau D^2 \Delta\lambda \cdot 10^{-0.4m}, \qquad (17.3.1) $$


where we set π(1 - ε²)/4 = 0.7, assuming a typical ε for a Cassegrain telescope.
This factor is included in all the relations that follow. The remaining factors in Eq.
(17.3.1) are defined as follows: N_0 = 10^4 photons/(sec cm² nm) for a zero-magnitude
A0 star at a wavelength of 550 nm, τ is the system transmittance from
the top of the atmosphere to the detector (not including slit losses), and Δλ is the
bandpass of the instrument used. For photometry the bandpass is defined by a
filter; for spectroscopy the bandpass is set by the spectrometer.

The photon flux from the sky background is given by 


$$ B = 0.7\,N_0 \tau D^2 \Delta\lambda' \cdot 10^{-0.4m'} \phi\phi', \qquad (17.3.2) $$


where Δλ' is the bandpass of sky on the detector, m' is sky brightness in
magnitudes per arc-second squared, and φφ' is the detector area in arc-seconds
squared projected on the sky. For stellar photometry and slit spectroscopy
Δλ' = Δλ; for slitless spectroscopy the two bandpasses are different.

In terms of photon flux, quantum efficiency Q, and exposure time t, we write
Eq. (17.2.2) as


$$ \mathrm{SNR} = \frac{\kappa SQt}{\sqrt{(\kappa S + B)Qt + Ct + R^2}} = \frac{\kappa SQt}{\sqrt{\kappa SQt + \langle n_n \rangle}}, \qquad (17.3.3) $$


where C and R are the dark counts per sec and rms read noise, respectively, as
used in Eq. (17.2.6), and ⟨n_n⟩ = BQt + Ct + R² is the sum of all contributors to the
noise other than the signal itself.

The factor κ in Eq. (17.3.3) is included to account for factors not included in
the transmittance of the system. In some photometric modes, for example, some
fraction of the flux in a stellar image may not fall on a given pixel or group of
pixels. For the HST, for example, the fraction of the energy on a set of pixels
centered on the image depends on the camera mode. The same is true for a
ground-based telescope measuring an image with a Gaussian profile. For slit
spectroscopy part of the image at the entrance slit may be intercepted by the slit
jaws and not reach the detector, or the signal of interest may be the core of an
absorption line. The factor κ can account for these factors.

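Other useful forms of Eq. (17.3.3) are obtained by solving this relation for
either m or t. We choose to solve Eq. (17.3.3) for m, with Eq. (17.3.1) substituted
for S. The result is


$$ m = -2.5 \log\left\{ \frac{(\mathrm{SNR})^2}{1.4\,N_0 \kappa \tau D^2 \Delta\lambda\, Qt} \left[ 1 + \left( 1 + \frac{4\langle n_n \rangle}{(\mathrm{SNR})^2} \right)^{1/2} \right] \right\}. \qquad (17.3.4) $$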


Representative results obtained from Eq. (17.3.4) for various combinations of 
parameters in different observation modes are given in the sections that follow. 
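
As a minimal sketch of how Eq. (17.3.4) can be evaluated, the Python fragment below solves Eq. (17.3.3) for the detected signal counts κSQt as the positive root of a quadratic, then converts to magnitude with Eq. (17.3.1). The argument `noise_counts` stands for the summed noise variance other than the photon noise of the signal itself (BQt + Ct + R²). Function and argument names are assumptions made for illustration.

```python
import math

N0 = 1.0e4  # photons/(sec cm^2 nm), zero-magnitude A0 star at 550 nm

def limiting_magnitude(snr, noise_counts, kappa, tau, d_cm, dlam_nm, q, t):
    """Magnitude reached at a given SNR, from Eqs. (17.3.1) and (17.3.3)."""
    scale = 0.7 * N0 * kappa * tau * d_cm ** 2 * dlam_nm * q * t
    # Positive root of x^2 - SNR^2 x - SNR^2 <n> = 0, with x = kappa S Q t
    x = 0.5 * snr ** 2 * (1.0 + math.sqrt(1.0 + 4.0 * noise_counts / snr ** 2))
    return -2.5 * math.log10(x / scale)

def snr_from_magnitude(m, noise_counts, kappa, tau, d_cm, dlam_nm, q, t):
    """Eq. (17.3.3) with S from Eq. (17.3.1); used here as a round-trip check."""
    counts = 0.7 * N0 * kappa * tau * d_cm ** 2 * dlam_nm * q * t * 10 ** (-0.4 * m)
    return counts / math.sqrt(counts + noise_counts)

if __name__ == "__main__":
    args = (1.0e5, 0.8, 0.324, 400.0, 100.0, 0.8, 2400.0)
    m = limiting_magnitude(10.0, *args)
    print(round(m, 2), round(snr_from_magnitude(m, *args), 6))
```

Setting `noise_counts` to zero recovers the signal-limited form, and making it much larger than the signal counts recovers the background-limited form.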

Before considering specific telescope and detector combinations, it is instruc- 
tive to look at two limiting cases for a noise-free detector, signal-limited and 
background-limited. In the former case we assume ⟨n_n⟩ is negligible; in the latter
case ⟨n_n⟩ = BQt and is large compared to the signal.

In the signal-limited case Eq. (17.3.4) becomes


$$ m = -2.5 \log\left[ \frac{(\mathrm{SNR})^2}{0.7\,N_0 \kappa \tau D^2 \Delta\lambda\, Qt} \right], \qquad (17.3.5) $$

while in the background-limited case

$$ m = 0.5m' - 1.25 \log\left[ \frac{(\mathrm{SNR})^2 \phi\phi'}{0.7\,N_0 \kappa^2 \tau D^2 \Delta\lambda\, Qt} \right]. \qquad (17.3.6) $$


We first consider the situation where observations for a fixed bandpass are made
with different telescopes and/or detectors to the same SNR level. We also assume
φ = φ' and constant sky brightness. Starting with Eq. (17.3.5) or Eq. (17.3.6), we
find the difference of the magnitudes reached as a function of the remaining
variables, for the same SNR. For the signal-limited case we get


$$ m_2 - m_1 = 2.5 \log\left[ \left(\frac{D_2}{D_1}\right)^2 \frac{\tau_2 Q_2 t_2}{\tau_1 Q_1 t_1} \right], \qquad (17.3.7) $$

and for the background-limited case

$$ m_2 - m_1 = 1.25 \log\left[ \left(\frac{D_2 \phi_1}{D_1 \phi_2}\right)^2 \frac{\tau_2 Q_2 t_2}{\tau_1 Q_1 t_1} \right]. \qquad (17.3.8) $$


From Eq. (17.3.7) we see that doubling the telescope diameter with all other
parameters held constant gives Δm = 1.5; for Eq. (17.3.8) the same conditions
give Δm = 0.75. Thus the faintness of a star observed to the same SNR is
proportional to the telescope area in the signal-limited case, but only proportional
to the telescope diameter in the background-limited case.

We see from Eq. (17.3.8) that the faintness of a star observed to the same SNR 
is inversely proportional to the image area with all other parameters constant. For 
ground-based telescopes the importance of good seeing in reaching faint 
magnitudes is therefore evident. In the event that the image diameter is 
determined by diffraction rather than seeing, Eq. (17.3.8) is modified by replacing
φ_1/φ_2 by D_2/D_1, and the faintness of a star observed to the same SNR is again
proportional to the telescope area.
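
The scaling arguments above reduce to two short formulas. The sketch below evaluates the magnitude gains of Eqs. (17.3.7) and (17.3.8) for a change in diameter or image size, with the transmittance, quantum efficiency, and exposure time entering through a single ratio. The function and argument names are illustrative.

```python
import math

def delta_m_signal_limited(d_ratio, tqt_ratio=1.0):
    """Eq. (17.3.7); d_ratio = D_2/D_1, tqt_ratio = tau_2 Q_2 t_2 / (tau_1 Q_1 t_1)."""
    return 2.5 * math.log10(d_ratio ** 2 * tqt_ratio)

def delta_m_background_limited(d_ratio, phi_ratio=1.0, tqt_ratio=1.0):
    """Eq. (17.3.8); phi_ratio = phi_1/phi_2 (image size of case 1 over case 2)."""
    return 1.25 * math.log10((d_ratio * phi_ratio) ** 2 * tqt_ratio)

if __name__ == "__main__":
    print(round(delta_m_signal_limited(2.0), 2))            # doubled diameter
    print(round(delta_m_background_limited(2.0), 2))        # doubled diameter
    print(round(delta_m_background_limited(1.0, 2.0), 2))   # image half as wide
    print(round(delta_m_background_limited(2.0, 2.0), 2))   # diffraction-limited
```

In the diffraction-limited case the image size scales inversely with diameter, so `phi_ratio` equals `d_ratio` and the background-limited gain matches the signal-limited one.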

We now consider the situation where the same telescope and detector are used 
for observations made to different SNR levels. In this case the result for signal- 
limited observations is 


$$ m_2 - m_1 = -5 \log\left(\mathrm{SNR}_2/\mathrm{SNR}_1\right), \qquad (17.3.9) $$

and for background-limited observations

$$ m_2 - m_1 = -2.5 \log\left(\mathrm{SNR}_2/\mathrm{SNR}_1\right). \qquad (17.3.10) $$

Therefore the slope in a log (SNR) versus magnitude plot is different by a factor
of two in the two regions.


17.4. DETECTION LIMITS: STELLAR PHOTOMETRY 


We now take Eq. (17.3.4) and give results for HST and ground-based 
telescopes of different aperture, each with the same detector and filter character- 
istics. The parameters used for the calculations are given in Table 17.3, with 
characteristics for the CCD taken from Table 17.1. 

We choose κ = 0.8 for all telescopes and assume this fraction of the
transmitted light contributes to the detected signal. For the wide-field mode of
ACS, the fraction κ of a stellar image falls on a few pixels. For a large ground-based
telescope, the fraction κ of an image with a Gaussian profile typically
covers many pixels. The number of pixels j spanning the image is given, in
appropriate units, as


$$ j = \frac{\text{image width}}{\text{pixel size}} = \frac{5\,F\,D(\mathrm{m})\,\phi(\text{arc-sec})}{\Delta(\mu\mathrm{m})}. \qquad (17.4.1) $$

We assume the total read noise is proportional to j², hence there is no on-chip
summing before readout. The dark count is also, of course, proportional to j².

Results obtained from Eq. (17.3.4) with the parameters in Table 17.3 are
shown in Fig. 17.5 for an exposure time of 2400 sec, the approximate time
available in the dark part of one HST orbit. From the results in Fig. 17.5 it is
evident that the ground-based observations are primarily in the background-limited
region, while those with the WFC of ACS are in the transition between the
signal-limited and background-limited regimes.


Table 17.3


Detector and Telescope Parameters


Detector        Δ = 15 μm, R = 5 e- rms/pixel,
                C = 0.003 e-/(pixel sec), κ = 0.8,
                Q = 0.8
Telescope       τ = (0.9)² = 0.81
Filter          τ = 0.8, Δλ = 100 nm (V-band)
Relay Optics    τ = 0.5

Other           m' [mag/(arc-sec)²]    φ (arc-sec)    F
WFC of ACS      23                     0.1            26
Ground          22                     1.0            2.5


Fig. 17.5. Representative SNR-apparent magnitude diagram for photometry. Values of the
parameters are given in Table 17.3. D = 4 m (dashed line); D = 10 m (solid line); HST (dotted line).

It is also of interest to note that the noise contribution of the detector is 
negligible, compared to the sky background, for each of the ground-based 
telescopes at the chosen focal ratio. For each of these telescopes the sky 
contribution is approximately 1000 times larger than the detector noise. For the 
HST, on the other hand, both the sky and detector noise are smaller, especially the 
former. The net result for this camera mode of the HST is detector noise 
comparable to the sky noise. At small SNR the effect of the detector noise is 
to reduce the limiting magnitude by about 0.4 magnitudes. 
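
A rough noise budget along these lines can be sketched from the entries of Table 17.3. The Python fragment below compares the sky counts BQt with the detector term (Ct + R²) summed over j² pixels. It assumes that the sky area is φ², that the read noise is 5 electrons rms per pixel, and that the HST aperture is 2.4 m. The exact ratios depend on how the image area and pixel footprint are defined, so the output should be read as an order-of-magnitude check, not a reproduction of Fig. 17.5.

```python
N0 = 1.0e4                    # photons/(sec cm^2 nm)
TAU = 0.81 * 0.8 * 0.5        # telescope x filter x relay optics
Q, T_EXP = 0.8, 2400.0        # quantum efficiency, exposure time (sec)
DLAM = 100.0                  # bandpass (nm)
PIXEL_UM = 15.0               # pixel size (microns)
READ, DARK = 5.0, 0.003       # e- rms/pixel (assumed reading), e-/(pixel sec)

def pixels_across(f_ratio, d_m, phi_arcsec):
    """Eq. (17.4.1): j = 5 F D(m) phi(arc-sec) / Delta(um)."""
    return 5.0 * f_ratio * d_m * phi_arcsec / PIXEL_UM

def sky_counts(d_m, sky_mag, phi_arcsec):
    """B Q t from Eq. (17.3.2), taking the sky area as phi^2 (assumption)."""
    d_cm = 100.0 * d_m
    b = 0.7 * N0 * TAU * d_cm ** 2 * DLAM * 10 ** (-0.4 * sky_mag) * phi_arcsec ** 2
    return b * Q * T_EXP

def detector_counts(f_ratio, d_m, phi_arcsec):
    """(C t + R^2) summed over j^2 pixels."""
    j = pixels_across(f_ratio, d_m, phi_arcsec)
    return (DARK * T_EXP + READ ** 2) * j ** 2

if __name__ == "__main__":
    cases = {"Ground 4 m": (2.5, 4.0, 1.0, 22.0), "HST WFC": (26.0, 2.4, 0.1, 23.0)}
    for name, (f_ratio, d_m, phi, sky) in cases.items():
        ratio = sky_counts(d_m, sky, phi) / detector_counts(f_ratio, d_m, phi)
        print(f"{name}: sky/detector = {ratio:.1f}")
```

Under these assumptions the sky term dominates by a large factor for the ground-based case, while the two terms are of similar size for the HST case, in line with the qualitative statements in the text.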

The results for the ground-based telescopes in Fig. 17.5 are based on a sky 
brightness of 22 mag/square arc-sec. If the sky is fainter by 0.5 magnitudes, the 
curves for these telescopes are shifted to the right by 0.25 magnitudes, as shown 
by Eq. (17.3.6). This change is not as pronounced as that obtained with better 
seeing; for an image diameter smaller by a factor of two the curves are shifted to 
the right by 0.75 magnitudes. 

The effect of better seeing is especially significant if adaptive optics are used 
to sharpen the image and increase the Strehl ratio. If, for example, the effective
image size is reduced to 0.2 arc-sec, then the curves for the ground-based 
telescopes in Fig. 17.5 are shifted to the right by 1.75 magnitudes. 

The results in Fig. 17.5 are intended only as an illustration of the relation 
between SNR and limiting magnitude for one given set of parameters. Because of 
the many variables that influence this relation, each particular telescope-filter 
combination requires its own set of calculations. 
17.5. DETECTION LIMITS: SPECTROSCOPY


Calculation of detection limits for spectroscopy requires a careful analysis of 
the type of spectrometer and the mode in which it is used. In this section we 
consider slit and slitless modes, and stellar and extended sources. Selected 
examples of representative telescope/spectrometer combinations are given. 


17.5.a. SLIT SPECTROSCOPY-STELLAR SOURCES 


The calculation of SNR for slit spectrometers proceeds by using Eqs. (17.3.3) 
and (17.3.4) modified for different source or observing conditions. For a stellar 
source there are two cases—the star image fits entirely within the slit, or part of 
the image falls on the slit jaws. If the star image is entirely within the slit we set 
κ = κ_0; if part of the image is intercepted by the slit we set κ = κ_0(φ/φ'), where
φ' is the diameter of the image and φ is the slit width projected on the sky. The
slit-limited case is the usual one with large telescopes, especially at high
resolution. The factor κ_0 is the level of the signal relative to the continuum.
For an emission line κ_0 > 1; in the core of an absorption line κ_0 < 1.

Incorporating this factor into Eq. (17.3.4) gives 


$$ m = -2.5 \log\left\{ \frac{(\mathrm{SNR})^2 (\phi'/\phi)}{1.4\,N_0 \kappa_0 \tau D^2 \Delta\lambda\, Qt} \left[ 1 + \left( 1 + \frac{4\langle n_n \rangle}{(\mathrm{SNR})^2} \right)^{1/2} \right] \right\}. \qquad (17.5.1) $$


In the case where φ = φ', Eq. (17.5.1) is the same as Eq. (17.3.4) at the
continuum level and the limiting cases for a noise-free detector are given by Eqs.
(17.3.5) and (17.3.6). The only difference for a spectrometer is Δλ = Pw', where
P is the plate factor and w' is the projected slit width in the spectrometer focal
plane.

In the slit-limited case we use Eq. (12.2.1) to express φ in terms of the
projected slit width, hence φ = w'/(rDF_2). This is required because φ and Δλ are
not independent of one another. With this substitution into Eq. (17.5.1) we see
that one factor of D is canceled. The limiting cases for a noise-free detector are


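$$ m = -2.5 \log\left[ \frac{(\mathrm{SNR})^2\, r F_2 \phi'}{0.7\,N_0 \kappa_0 \tau P w'^2 D\, Qt} \right] \qquad (17.5.2) $$


in the signal-limited case, and


$$ m = 0.5m' - 1.25 \log\left[ \frac{(\mathrm{SNR})^2\, r F_2 \phi'^3}{0.7\,N_0 \kappa_0^2 \tau P w'^2 D\, Qt} \right] \qquad (17.5.3) $$


in the background-limited case.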
Table 17.4


Spectrometer Parameters


                 Grating                   Echelle
                 600 grooves/mm, m = 1     tan δ = 2.0
F_2              1.5                       2.0
r                0.9                       0.7
P (nm/mm)        5.56                      0.368
ℛ (at 500 nm)    3000                      45,000
τ (system)       0.15                      0.10


Other: φ' = 1 arc-sec, κ_0 = 1, d_1 = 200 mm.
Projected slit w' spans two pixels (30 μm).
Sky background = 22 mag/(arc-sec)².


From these relations we see that the faintness of a star observed to the same
SNR is proportional to D in the signal-limited case, and proportional to √D in the
background-limited case. Note that for a diffraction-limited telescope we have
φ' ∝ 1/D, and the faintness reached at a given SNR is proportional to D².

If a given system is used to make observations to different SNR levels, the 
relations obtained from Eqs. (17.5.2) and (17.5.3) are the same as those given in 
Eqs. (17.3.9) and (17.3.10), respectively. Hence the slope in a log (SNR) versus 
magnitude plot is again different by a factor of two in the two regions. 

We now give results derived using Eq. (17.5.1) for ground-based telescopes 
with representative spectrometers. The parameters of each spectrometer are given 
in Table 17.4; the detector used is a CCD with parameters given in Table 17.3. 
Results obtained from Eq. (17.5.1) with the given parameters are shown in Fig. 
17.6 for an exposure time of 2400 sec. The choice of κ_0 = 1 indicates observations
at the continuum level of the stellar spectra. The detector noise is larger than
the sky noise in the echelle mode by a factor of about 40; the detector and sky 
noise are comparable for the grating mode. Entrance slit widths are approximately 
1.1 and 0.45 arc-sec for the 4- and 10-m telescopes, respectively. 

In both modes the magnitude reached at SNR = 100 is close to that found from 
Eq. (17.5.2) for the signal-limited case. The curvature evident in Fig. 17.6 as SNR 
decreases indicates a transition to the detector-limited regime. 
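
The slit widths and resolving powers quoted above can be recovered from the entries of Table 17.4. The sketch below assumes the slit relation φ = w'/(rDF_2), with r and F_2 read from the table as 0.9 and 1.5 for the grating, and the resolving power λ/(Pw') that follows from Δλ = Pw'. Function names are illustrative.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def slit_width_arcsec(w_um, r, d_m, f2):
    """Projected slit width on the sky: phi = w'/(r D F_2), in arc-sec."""
    return (w_um * 1.0e-6) / (r * d_m * f2) * ARCSEC_PER_RAD

def resolving_power(lam_nm, plate_factor_nm_per_mm, w_um):
    """lambda / (P w'), with the bandpass given by P w'."""
    return lam_nm / (plate_factor_nm_per_mm * w_um * 1.0e-3)

if __name__ == "__main__":
    for d_m in (4.0, 10.0):
        print(f"D = {d_m} m: slit = {slit_width_arcsec(30.0, 0.9, d_m, 1.5):.2f} arc-sec")
    print(round(resolving_power(500.0, 5.56, 30.0)))    # grating
    print(round(resolving_power(500.0, 0.368, 30.0)))   # echelle
```

The slit narrows on the sky in proportion to 1/D for a fixed spectrometer, which is the reason the slit-limited case is the usual one with large telescopes.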


17.5.b. SLIT SPECTROSCOPY-EXTENDED SOURCES 


The photon flux of an extended source collected by a telescope of diameter D 
and transmitted to the detector is given by 


$$ S = 0.7\,N_0 \tau D^2 \Delta\lambda \cdot 10^{-0.4m} \phi\phi', \qquad (17.5.4) $$


where m is the source brightness in magnitudes per square arc-sec and φφ' is
again the detector area in square arc-sec projected on the sky. For a source whose
spectrum is a composite of stellar spectra, the parameter N_0 has the nominal value
given following Eq. (17.3.1); for an emission line source the product N_0Δλ is
adjusted to the proper flux value. The flux from the sky background is given by
Eq. (17.3.2).

Fig. 17.6. Representative SNR-apparent magnitude diagram for spectrometry. Parameters for the
detector are in Table 17.3, for the spectrometers in Table 17.4. D = 4 m (dashed line); D = 10 m (solid
line). Curves on the left are for an echelle, those on the right for a first-order grating.

For both source and sky the flux is proportional to D²φφ', and using Eq.
(12.2.1) we get D²φφ' = w'h'/(rF_2²). Hence the SNR depends on the camera focal
ratio but is independent of the telescope diameter. Substituting this result in Eq.
(17.3.3), we find that


$$ \mathrm{SNR} \propto \left(r F_2^2\right)^{-1/2}. \qquad (17.5.5) $$


It is evident from Eq. (17.5.5) that SNR is larger for smaller F_2. This result is in
accord with that given in Eq. (12.2.17) for the irradiance of an image. Thus
observation of an extended source to a given SNR level, either through a
spectrometer or filter, is done in a shorter time with a faster camera.


17.5.c. SLITLESS SPECTROSCOPY 


For observations of stellar sources in the slitless mode, the source and
background signals are given by Eqs. (17.3.1) and (17.3.2), respectively. In this
mode Δλ = Pw' for the star signal, while Δλ' is the bandpass for all wavelengths
transmitted to the detector. For the background-limited case, assuming a noise-free
detector, the only one considered here, we find


$$ m = 0.5m' - 1.25 \log\left[ \frac{(\mathrm{SNR})^2 \phi\phi'}{0.7\,N_0 \tau D^2 \Delta\lambda\, Qt} \left(\frac{\Delta\lambda'}{\Delta\lambda}\right) \right], \qquad (17.5.6) $$


where κ is set equal to one. Note that Eq. (17.5.6) is simply Eq. (17.3.6) modified
for the case of different spectral and sky bandpasses. The effect of the bandpass
ratio is to give a brighter magnitude reached for a given SNR, assuming all other
parameters are the same.

To illustrate the effect we assume a nonobjective transmission grating mode
with a spectral resolving power ℛ = 100. If Δλ' is taken as λ/2, where λ is the
wavelength at the blaze peak, then the bandpass ratio Δλ'/Δλ is approximately
50. The difference in magnitude with and without this factor is
Δm = -1.25 log (50), or about 2.1 magnitudes. Although the limiting magnitude
is brighter for this slitless mode, the gain is 50 spectral elements per source
compared to one obtained with a narrow filter in a photometry mode.
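
The size of this penalty for other resolving powers follows from the bandpass ratio alone. In the short sketch below the sky bandpass is taken as a fixed fraction of the wavelength (one half, as in the example above) and the spectral bandpass as the wavelength divided by the resolving power. The function and argument names are illustrative.

```python
import math

def slitless_penalty_mag(resolving_power, sky_band_fraction=0.5):
    """Change in limiting magnitude from the bandpass ratio in Eq. (17.5.6).

    The spectral bandpass is lambda / resolving_power and the sky bandpass is
    sky_band_fraction * lambda, so their ratio does not depend on lambda.
    """
    band_ratio = sky_band_fraction * resolving_power
    return -1.25 * math.log10(band_ratio)

if __name__ == "__main__":
    for rp in (50, 100, 500):
        print(rp, round(slitless_penalty_mag(rp), 2))
```

A negative value means a brighter limiting magnitude, so higher resolving power in the slitless mode comes at a steadily increasing cost in depth.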
Chapter 18 Large Mirrors and Telescope Arrays


The most challenging aspect in constructing a telescope with large light 
gathering power and a capability of achieving high angular resolution is the 
primary mirror. In this chapter we discuss various aspects of building and testing 
mirrors, especially large concave mirrors, and the effects of residual errors on 
image quality, including errors that arise in the support of such mirrors. We also 
consider telescopes used in concert to achieve angular resolution beyond that 
possible with a single telescope. 


18.1. LARGE MIRRORS 


The history of large mirrors is closely linked to the technologies of casting 
glass and polishing the surface to the desired shape. As these technologies 
developed it became possible to make both larger and faster monolithic mirrors. 
During the 1970s and 1980s it also became clear that the mass per unit collecting 
area had to be reduced. It also became evident during this time that mirrors in the 
8-m class were approaching the size limit for single glass slabs and that 
segmented mirrors would be required for larger primaries. The outcome of 
these developments is a wide variety of mirror types, as shown in Table 18.1, with 
the order of the mirror examples given decreasing in mass per unit collecting area 
from top to bottom. 
Table 18.1


Types and Examples of Large Primary Mirrors


Type                            Example

Monolithic (rigid)
  Solid (thick)                 Hooker 2.5-m
  Cast with cores               Hale 5-m
  Spin-cast honeycomb           MMT conversion 6.5-m
  Slumped honeycomb             HST 2.4-m
Monolithic (flexible)
  Solid (thin)                  Gemini, VLT 8-m
Segmented
  Sphere (91 segments)          Hobby-Eberly 9-m
  Paraboloid (36 segments)      Keck 10-m


18.1.a. MIRROR SHAPING 


The figure is the basic prescription for the shape of a mirror, as given by Eq. 
(5.1.1). Most telescope primaries, either monolithic or segmented, have an 
aspheric figure, paraboloidal or hyperboloidal, although spherical mirrors are 
the likely choice for mirrors of the future, say 15 m or larger. Techniques for 
shaping large mirrors have evolved in response to the demands for faster mirrors. 

For an aspheric mirror of conic constant $K$ the local curvature on the surface is 
a function of the radial distance $r$ from its center, as shown by Eq. (3.5.6) and 
restated here. The local radius of curvature is 


$$R_{\mathrm{loc}} = R\left[1 - K\left(\frac{\varepsilon^2}{16F^2}\right)\right]^{3/2}, \tag{18.1.1}$$


where $F = |f|/D$, $R$ is the vertex radius of curvature, and $r = \varepsilon D/2$, with 
$0 \le \varepsilon \le 1$. The change in curvature is more significant for smaller focal ratios, 
that is, for faster mirrors. As an example, for a paraboloid with $K = -1$ and $F = 1.25$, 
the radius of curvature at the edge is about 6% larger than at the center. 
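
The 6% figure can be checked directly from Eq. (18.1.1). The following is a minimal Python sketch; the function name `local_radius_ratio` and its argument names are illustrative choices, not notation from the text, and the exponent 3/2 is the one that reproduces the quoted example.

```python
def local_radius_ratio(K, F, eps):
    """Ratio of local to vertex radius of curvature, per Eq. (18.1.1).

    K   : conic constant
    F   : focal ratio |f|/D
    eps : normalized zone radius, r = eps * D / 2, with 0 <= eps <= 1
    """
    return (1.0 - K * eps**2 / (16.0 * F**2)) ** 1.5


# Edge of an f/1.25 paraboloid, the example quoted in the text.
ratio_edge = local_radius_ratio(K=-1.0, F=1.25, eps=1.0)
print(f"R_loc/R at the edge: {ratio_edge:.4f}")

# A slower f/3 paraboloid for comparison: the variation is much smaller.
print(f"R_loc/R at the edge, f/3: {local_radius_ratio(-1.0, 3.0, 1.0):.4f}")
```

For a sphere ($K = 0$) the ratio is unity at every zone, which is why spherical segments are easier to polish.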

These significant differences in curvature across a mirror have resulted in new 
techniques for polishing mirrors to the required precision. One such technique is 
computer-controlled polishing with small tools, such as used on the f/2.3 
primary of HST with a lap approximately 5 cm in diameter. Another technique 
is stressed-lap polishing in which the surface of a rotating lap is continuously 
adjusted to conform in radius to that part of the mirror below it, as used for the 
6.5-m, f/1.25 primary mirror of the MMT conversion (see Martin et al. (1998)). 

The segmented approach has been adopted for primaries of diameter larger 
than about 8 m. The 9-m primary of the Hobby-Eberly Telescope (HET) and the 
primaries of the Keck 10-m telescopes (TMT) are examples of segmented 
mirrors. The figure of each hexagonal section of HET is spherical and making 
high quality segments is relatively easy. For the TMTs, on the other hand, the 
overall figure is paraboloidal and the figure of a hexagonal section depends on the 
location of that section in the primary. 

The technique used to polish the sections of the TMTs is called stressed- 
mirror polishing. A hexagonal mirror is bent to a prescribed shape and a spherical 
shape is polished onto the surface of the stressed glass. When the stresses are 
removed and the glass returns to its unstressed state, the surface has the desired 
shape of a section of an off-axis paraboloid (see Mast and Nelson (1990)). 

An advantage of the segmented approach is that primaries of very large size 
are possible. An obvious disadvantage of this approach is the complexity of the 
structure required to hold the individual modules in their proper positions. For 
details on the modular approach to primary mirrors, the interested reader should 
consult the literature on the TMT project. 


18.1.b. TESTING 


Aberration characteristics of telescopes discussed in previous chapters are 
derived assuming the mirrors are essentially perfect. A mirror produced in the 
optical shop is not perfect, however, but if the actual figure agrees with the 
prescription to within a specified tolerance the telescope performance is not 
noticeably degraded. In this section we discuss briefly the principal method used 
to ensure that a mirror figure is within a given tolerance. This discussion is only a 
quick look at a large subject and the interested reader should consult the literature 
for details. A good overview of a large number of testing methods is given in the 
book edited by Malacara (1978) cited at the end of the chapter. 

For our discussion we assume that a mirror is perfect if it is diffraction-limited, 
that is, the rms deviation of the reflected wavefront compared to the ideal 
wavefront is $\le \lambda/14$, where $\lambda$ is the test wavelength. The definition of rms 
wavefront error and details on the origin of the stated condition are given in 
Section 10.3. The sensitivity of the test method discussed here is considerably 
better than the diffraction-limited criterion; wavefront errors of $\sim\lambda/100$ can be 
measured. 

The method most widely used for testing large concave mirrors is Twyman- 
Green interferometry, and the test device is a modified Michelson interferometer, 
as shown in Fig. 18.1. For testing a concave mirror, a Michelson interferometer, 
shown schematically in Fig. 13.16, is converted into a Twyman-Green by 
replacing the moveable mirror B in Fig. 13.16 with a positive lens, usually 
called a null lens, and the test mirror. The lens and mirror are placed so their 
optical axes coincide, with the focal point of the lens coincident with the center of 
curvature of the mirror. The other change is to use a laser plus beam expander as 
the light source, with the pinhole in the beam expander placed at the focal point 
of the input collimator lens. 


Fig. 18.1. Twyman-Green interferometer. A, fixed mirror; L, null lens; M, test mirror; D, detector. 

The beamsplitter divides the amplitude of the incident plane wavefront, which, 
after reflection in the two arms of the interferometer, returns to the beamsplitter. A 
portion of the two reflected beams is recombined at the beamsplitter and the 
superposition of the beams is directed toward the output lens and an array 
detector, the latter designated D in Fig. 18.1. The detector D must be placed in a 
plane conjugate to the pupil of the instrument and the interferogram at the pupil 
must represent the wavefront distortions introduced by the mirror under test. 

If the null lens-mirror combination is perfect, and the reflected beam in each 
arm retraces exactly the path of the light from the beamsplitter, then the 
illumination at D is uniform. If, instead, the plane mirror designated A in Fig. 
18.1 is tilted, a set of straight fringes with equal spacing is recorded by the 
detector. The spacing of the fringes is inversely proportional to the tilt angle. Each 
fringe traces a zone of equal optical path difference between the two inter- 
ferometer arms; the recorded fringes are of the equal thickness type. 
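
The inverse dependence of fringe spacing on tilt can be illustrated numerically. The Python sketch below assumes a small tilt of mirror A, a reflected wavefront tilted by twice that angle (so the path difference across the beam is $2\theta x$), and fringes measured in the plane of mirror A. The 632.8-nm He-Ne wavelength, the sample tilts, and the function name `fringe_spacing` are illustrative assumptions, not values taken from the text.

```python
import math


def fringe_spacing(wavelength, tilt):
    """Spacing of straight tilt fringes, in the same length unit as wavelength.

    Assumes a small mirror tilt in radians. Reflection doubles the wavefront
    tilt, so adjacent fringes are separated by wavelength / (2 * tilt).
    """
    return wavelength / (2.0 * tilt)


wavelength = 632.8e-9  # metres; assumed He-Ne test laser
for arcsec in (1.0, 2.0, 4.0):
    tilt = math.radians(arcsec / 3600.0)
    spacing_mm = fringe_spacing(wavelength, tilt) * 1e3
    print(f"tilt {arcsec:.0f} arcsec -> fringe spacing {spacing_mm:.1f} mm")
```

Doubling the tilt halves the spacing, so the number of fringes across the pupil is a convenient adjustable parameter when setting up the test.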

If the lens-mirror combination is not perfect and mirror A is tilted, then a set of 
distorted or curved fringes is seen. From measurements of the positions of the 
fringes at many points on the interferogram it is possible to find the shape of the 
distorted wavefront, including tilt. If this tilted wavefront is then represented in 
terms of a linear combination of Zernike polynomials, it is a simple matter to 
remove the tilt and recover the actual wavefront. Examples of fringe patterns seen 
with various types of aberrations in the test element and various tilts of mirror A 
are displayed in the book edited by Malacara (1978). 
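
The tilt-removal step can be sketched with a least-squares fit of the three lowest terms of the expansion (piston, x tilt, y tilt), which are proportional to the first three Zernike polynomials. The Python example below is a simplified illustration on synthetic data: the sampling grid, the tilt values, and the small astigmatism-like residual are all made up for the demonstration, and a real reduction would fit many more Zernike terms to measured fringe positions.

```python
def fit_plane(points, values):
    """Least-squares fit of values ~ a + b*x + c*y; returns (a, b, c)."""
    # Accumulate the 3x3 normal equations  M @ coeffs = rhs.
    M = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for (x, y), w in zip(points, values):
        basis = (1.0, x, y)  # piston, x tilt, y tilt
        for i in range(3):
            rhs[i] += basis[i] * w
            for j in range(3):
                M[i][j] += basis[i] * basis[j]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    A = [row + [r] for row, r in zip(M, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return tuple(A[i][3] / A[i][i] for i in range(3))


# Sample points on a unit-radius pupil (hypothetical measurement grid).
n = 20
points = [(i / n, j / n) for i in range(-n, n + 1) for j in range(-n, n + 1)
          if i * i + j * j <= n * n]

# Synthetic wavefront in waves: deliberate tilt plus a small error of interest.
astig = [0.05 * (x * x - y * y) for x, y in points]
measured = [3.0 * x - 1.5 * y + e for (x, y), e in zip(points, astig)]

a, b, c = fit_plane(points, measured)
residual = [w - (a + b * x + c * y) for (x, y), w in zip(points, measured)]
print(f"fitted piston {a:.3f}, x tilt {b:.3f}, y tilt {c:.3f} (waves)")
print(f"peak residual after tilt removal: {max(abs(r) for r in residual):.3f} waves")
```

Because the tilt terms are orthogonal to the remaining error over a symmetric pupil, subtracting the fitted plane leaves the wavefront error of the element under test unchanged.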

It should be evident from this discussion that the test using the Twyman-Green 
interferometer is one of the null lens and mirror combination, not of the mirror by 
itself. It is necessary, therefore, to test the lens independently and verify that it has 
the required characteristics. If, for example, the test mirror is a sphere, then the 
null lens by itself must be perfect. If, on the other hand, the test mirror is a conic 
other than a sphere, then the null lens must introduce an amount of spherical 
aberration that exactly compensates the spherical aberration of the mirror in the 
test setup. Such a null lens was required, for example, in the testing of the 2.4-m 
hyperboloidal primary of the Hubble Space Telescope with a Twyman-Green 
interferometer. Unfortunately, this lens was not independently checked following 
reassembly of its elements and the result was an HST primary built to a different 
conic constant than called for by the design. 

As noted at the start of this section, this is only an introduction to one method 
of optical testing. We have not considered other methods used in the past with 
large mirrors, such as the Foucault or knife-edge test and Hartmann screen test. 
The latter method was used to test the 4-m mirrors made at the Kitt Peak National 
Observatory during the early 1970s. We have also omitted any discussion of test 
methods for convex conic mirrors, such as Hindle type tests. For details on these 
and other test methods, and further references to the literature, the reader should 
consult the Malacara reference. 


18.1.c. RESIDUAL ERRORS AND SCATTERED LIGHT 


Analysis of interferograms obtained during the testing of a mirror leads to a 
detailed contour map of the hills and valleys of the mirror surface relative to the 
prescribed surface. Terms in the Zernike expansion representing the errors of low 
spatial frequency on the wavefront, such as spherical aberration, coma, and 
astigmatism, provide an approximate measure of the mirror quality. Polishing and 
testing continue until these aberrations are brought to a specified level expressed 
in waves rms. 

If these low frequency errors are removed from the interferogram, the residuals 
are attributed to randomly generated mid- and high-frequency spatial errors. From 
our discussion in Section 11.1 we find the representation of these errors in terms 
of degradation functions in Eqs. (11.1.15) and (11.1.16). These equations, 
repeated here, are 


$$T_m = \exp\{-k^2\omega_m^2[1 - c(\nu_n)]\}, \tag{18.1.2}$$


$$T_h = \exp(-k^2\omega_h^2), \tag{18.1.3}$$


where $m$ and $h$ denote mid- and high-frequency, respectively, $k = 2\pi/\lambda$, $\omega$ is the 
rms random wavefront error, $\nu_n$ is the normalized spatial frequency, and $c(\nu_n)$ is 
the normalized autocorrelation function. See Section 11.1 for a discussion of 
these degradation factors. 

As the errors represented by these factors are independent of one another and 
of the figure error, they can be combined in quadrature. Hence 


$$\omega^2 = \omega_f^2 + \omega_m^2 + \omega_h^2, \tag{18.1.4}$$


where the subscript $f$ denotes the figure error and $\omega^2$ is the mean square error 
(mse) on the wavefront for errors at all spatial frequencies. The overall quality of 
the mirror is thus given in terms of the separate errors and the combined error. 

We discussed some of the effects of the mid- and high-frequency errors on the 
point spread function (PSF) in Section 11.1, and extend that discussion here. As 
noted in Section 11.1.d, the effect of $T_h$ on the PSF computed with Eq. (11.1.12) 
is to decrease the PSF by this factor at all points on an image. The same reduction 
also occurs in the encircled energy (EE), as seen by inspection of Eq. (11.1.13). 

Thus the fractional changes in the PSF and EE are given by 


$$\frac{\Delta\mathrm{PSF}}{\mathrm{PSF}} = \frac{\Delta\mathrm{EE}}{\mathrm{EE}} = 1 - \exp(-k^2\omega^2) \simeq k^2\omega^2. \tag{18.1.5}$$


The approximation in Eq. (18.1.5) is good for $\omega \sim \lambda/20$ or smaller. 
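
To see how good the approximation in Eq. (18.1.5) is, the exact and approximate forms can be compared for a few rms errors. This is a minimal Python sketch; the function name `fractional_loss` and the choice of sample errors are illustrative, and the rms error is expressed in waves so that $k\omega = 2\pi \times$ (error in waves).

```python
import math


def fractional_loss(omega_waves):
    """Return (exact, approximate) fractional loss from Eq. (18.1.5).

    omega_waves is the rms wavefront error in units of the wavelength.
    """
    x = (2.0 * math.pi * omega_waves) ** 2  # k^2 * omega^2
    return 1.0 - math.exp(-x), x


for denom in (10, 20, 40):
    exact, approx = fractional_loss(1.0 / denom)
    print(f"omega = lambda/{denom}: exact {exact:.4f}, approx {approx:.4f}")
```

The approximate form always overestimates the loss slightly, and the two agree to within a few percent of each other once the rms error is near $\lambda/20$ or below.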

The effect of the high frequency error on the mirror is to scatter a fraction of 
the light at large angles from the PSF, hence the light is effectively lost. A similar 
displacement of light from the core of the PSF also occurs when mid-frequency 
errors are present on a mirror. For these errors the scattered light from the inner 
part of the PSF appears in the outer rings of the Airy pattern, as shown in Figs. 
11.7 and 11.8. The fractional changes in the Strehl ratio and EE within a few Airy 
rings for mid-frequency errors are also given by Eq. (18.1.5). 


18.1.d. MIRROR SUPPORT AND PRINT-THROUGH 


Another factor affecting image quality is print-through, undulations on a 
mirror surface introduced either during the polishing process or by the support 
structure. For rigid honeycomb mirrors the thickness of the mirror across its face 
varies in a regular pattern. If the thinner sections of the mirror face are depressed 
slightly during the polishing, the result is a regular pattern of bumps when the 
polishing tool is removed. A similar pattern can be polished into a thin flexible 
mirror supported at a finite number of points during the polishing process. Even 
with a perfect thin mirror, print-through can be introduced if the support structure 
in the telescope does not have sufficient support points. 

In this section we consider briefly the print-through for a square array and 
outline the approach to analyzing its effect. As a starting point we assume a 
surface deformation $\delta\zeta$ of the form 


$$\delta\zeta = A\cos\left(\frac{2\pi\xi}{l}\right)\cos\left(\frac{2\pi\eta}{l}\right), \tag{18.1.6}$$


where $\delta\zeta$ is the difference from the prescribed figure, $2A$ is the peak-to-valley 
amplitude, $(\xi, \eta)$ is the coordinate frame on the mirror, and $l$ is the center-to- 
center spacing between undulations. A contour map for this $\delta\zeta$ is shown in Fig. 
18.2. 

The effect of this surface error on the PSF is found by noting that the 
wavefront error $\Phi = 2\,\delta\zeta$, substituting $\Phi$ into the aperture function in Eq. 
(10.5.1), and integrating Eq. (10.5.1) to find the amplitude $U(x, y)$ in the image 
plane. For our purposes we consider only the effect on the peak intensity $i(0)$. 
Following the procedure in Section 10.3 for a square aperture of side $l$ we find 


$$\langle\Phi^n\rangle = \frac{1}{l^2}\int\!\!\int_{-l/2}^{l/2} \Phi^n \, d\xi \, d\eta. \tag{18.1.7}$$


Computing the rms wavefront error according to Eq. (10.3.8) gives $\omega = A$. For an 
otherwise perfect mirror, the Strehl ratio $S = 1 - k^2\omega^2$. As an example, if 
$S = 0.9$ at $\lambda = 500$ nm, then $A = 25$ nm, $\omega = 0.05$ waves, and the peak-to- 
valley amplitude is 50 nm. 


Fig. 18.2. Print-through pattern according to Eq. (18.1.6). 
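
Both results in this example can be verified numerically. The Python sketch below assumes the undulation has period $l$ in each direction, so the wavefront error is $\Phi = 2A\cos(2\pi\xi/l)\cos(2\pi\eta/l)$, and it takes the rms error as the square root of $\langle\Phi^2\rangle - \langle\Phi\rangle^2$. The function name, the grid size `n`, and the midpoint-rule integration are illustrative choices, not part of the text.

```python
import math


def rms_wavefront_error(A, l=1.0, n=200):
    """Midpoint-rule rms of Phi = 2*A*cos(2*pi*xi/l)*cos(2*pi*eta/l) over one cell."""
    total = 0.0
    total_sq = 0.0
    for i in range(n):
        xi = -l / 2 + (i + 0.5) * l / n
        for j in range(n):
            eta = -l / 2 + (j + 0.5) * l / n
            phi = 2.0 * A * math.cos(2 * math.pi * xi / l) * math.cos(2 * math.pi * eta / l)
            total += phi
            total_sq += phi * phi
    mean = total / n**2
    mean_sq = total_sq / n**2
    return math.sqrt(mean_sq - mean**2)


# The rms wavefront error equals the surface amplitude A (here in nm).
print(f"rms for A = 25 nm: {rms_wavefront_error(25.0):.3f} nm")

# Invert S = 1 - k^2 * omega^2 for the example in the text.
wavelength_nm = 500.0
S = 0.9
omega_nm = math.sqrt(1.0 - S) / (2.0 * math.pi) * wavelength_nm
print(f"omega for S = {S} at {wavelength_nm:.0f} nm: {omega_nm:.1f} nm")
```

The factor of 2 between surface and wavefront error is offset by the factor of 1/2 from averaging the product of two squared cosines, which is why the rms wavefront error equals the surface amplitude.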

Thus the effect of the print-through pattern is to remove light from the core of 
the PSF and transfer it outward in the PSF. Unlike the radial scattering from 
random errors, the regular pattern of the undulations causes the light to appear in 
subsidiary peaks around the otherwise radially symmetric PSF. These side peaks 
are located at the secondary maxima derived from Eq. (10.1.6), with the main 
subsidiary peaks located at angles of approximately $\lambda/l$ from the PSF peak. 
Calculations for this example show that the maximum intensity of a subsidiary 
peak is ~1% of that of the main PSF. 

Similar calculations for other patterns, such as hexagonal undulations, give 
comparable results. The principal difference is the location of the subsidiary 
peaks around the PSF. 


18.1.e. CONCLUDING REMARKS 


The art and science of putting the proper figure on a large concave mirror must 
deal with many factors, as the discussion in this section has indicated. From lack 
of smoothness at a microscopic level to a large scale print-through pattern, errors 
over a large range of spatial frequencies contribute to image degradation. 

Given the typical requirement that a telescope system from end-to-end, not 
including the atmosphere, be approximately diffraction-limited, it is not 
surprising that the requirement for its large primary mirror must be set at 
$\lambda/20$ or smaller. For a space telescope the specification for the primary mirror 
is typically $\lambda/40$ or smaller. Except for the built-in spherical aberration of the 
HST primary, this specification was met. 


18.2. TELESCOPE ARRAYS; INTERFEROMETERS 


The usual configuration of an optical telescope is one with a single circular 
aperture and an angular resolution in the diffraction limit of order $\lambda/D$. This 
dependence on wavelength and diameter applies to any telescope, thus the largest 
single aperture radio telescope has an angular resolution several orders of 
magnitude poorer than that of any optical telescope of modest size. To overcome 
this limitation, radio telescope configurations with multiple apertures have long 
been used to achieve high angular resolution. Given the resolution limit of $\lambda/D$, 
this means a large effective $D$, hence a long baseline across an array of phased 
telescopes or interferometers. 



In contrast to the case for radio telescopes, angular resolution of ground-based 
optical telescopes is almost always limited by the atmosphere; in addition, 
stringent requirements for optical interferometers worked against their develop- 
ment. With the drive for higher angular resolution, however, increased attention 
has been given to the possibility of phased arrays, especially at infrared 
wavelengths. 

Optical path differences between the elements of an interferometer are 
measured in fractions of a wavelength and the technical problems of maintaining 
differences to this accuracy are formidable. Our approach is to assume that the 
technical problems are solved and that the elements of an array are properly 
phased. 


18.2.a. DIFFRACTION IMAGES 


In this section we describe the diffraction image given by selected phased array 
configurations, each of which has some number N of identical telescopes with 
circular apertures. Arrangements discussed include linear arrays with an even 
number of equally spaced telescopes, and a square array with N =4. For a 
discussion of these and other types of arrays, including an atlas of diffraction 
images, the reader should consult the excellent article by Meinel et al. (1983) 
listed in the references. 

The calculation of the diffraction image of an array of telescopes is done by 
using the array theorem given in Section 10.5.c. Consider a set of identical 
telescopes or, equivalently, exit pupils distributed as shown in Fig. 18.3. For 
perfect telescopes the wavefront at each exit pupil is part of a spherical surface 
whose center is at O in Fig. 18.3. The amplitude $U$ at a point P in the image is 
found by evaluating Eq. (10.5.5), where each term in the sum locates the center of 
one of the $N$ telescopes. When this is done for $N$ circular apertures, each of 
diameter $D$ and focal length $f$, the result is 


$$U(x, y) = C\,\frac{2J_1(v)}{v}\sum_{j=1}^{N}\exp[-ik(p\xi_j + q\eta_j)], \tag{18.2.1}$$


where $C$ is a normalization constant, $v = \pi D\alpha/\lambda$ [see Eq. (10.2.10)], $p = x/f$, 
$q = y/f$, and $(\xi_j, \eta_j)$ are the coordinates at the center of the $j$th telescope. 

Note that the factor in front of the sum in Eq. (18.2.1) is the amplitude for a 
single circular clear aperture, as given in Eq. (10.2.7). The sum in Eq. (18.2.1) 
represents the superposition of the amplitudes from the $N$ apertures treated as 
coherent sources, where $k(p\xi_j + q\eta_j)$ is the phase difference between a wave from 
the center of the $j$th telescope and one from the origin of the $(\xi, \eta)$ coordinate 
system. 


Fig. 18.3. Schematic layout of telescopes, each of focal length $f$, with pupils in the $\xi\eta$ plane and 
combined image in the $xy$ plane. 


The intensity at point P of the diffraction pattern is given by $I(P) = |U(P)|^2$, 
with the normalized intensity $i(P) = I(P)/I(O)$. In effect, the intensity of the 
diffraction pattern is that of a single telescope modulated by an interference 
pattern produced by the separate telescopes. This is similar to the result given in 
Eq. (13.4.1) for a diffraction grating, with the intensity the product of a blaze 
function and an interference factor. We now evaluate Eq. (18.2.1) for selected 
arrays of telescopes. 
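
Before specializing to particular layouts, Eq. (18.2.1) can be evaluated directly for any set of aperture centers. The Python sketch below is one way to do this with the standard library only; the function names, the use of Bessel's integral to obtain $J_1$, and the convention of giving aperture centers in units of $D$ are assumptions made for this illustration. With centers in units of $D$, the phase $k p \xi_j$ becomes $2 v_x \xi_j / D$, where $v_x = \pi D x/\lambda f$.

```python
import cmath
import math


def bessel_j1(v, steps=400):
    """J1(v) from Bessel's integral (1/pi) * int_0^pi cos(t - v*sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = sum(math.cos((m + 0.5) * h - v * math.sin((m + 0.5) * h)) for m in range(steps))
    return total * h / math.pi


def airy_amplitude(v):
    """Single-aperture amplitude 2*J1(v)/v, which tends to 1 as v -> 0."""
    return 1.0 if abs(v) < 1e-9 else 2.0 * bessel_j1(v) / v


def array_intensity(vx, vy, centers):
    """Normalized intensity i(P) for identical, phased circular apertures.

    vx, vy  : pi*D*x/(lambda*f) and pi*D*y/(lambda*f)
    centers : list of aperture centers (xi_j, eta_j) in units of the diameter D
    """
    v = math.hypot(vx, vy)
    # Interference sum of Eq. (18.2.1); phase is 2*(vx*xi_j + vy*eta_j).
    s = sum(cmath.exp(-2j * (vx * xi + vy * eta)) for xi, eta in centers)
    return (airy_amplitude(v) * abs(s) / len(centers)) ** 2


# Two telescopes whose centers are separated by 2D along the xi axis.
pair = [(-1.0, 0.0), (1.0, 0.0)]
for vx in (0.0, 0.2, 0.4, 0.6, math.pi / 4):
    print(f"vx = {vx:.3f}: i = {array_intensity(vx, 0.0, pair):.4f}")
```

Dividing by the number of apertures normalizes the peak to unity, and apertures of unequal diameter could be handled by giving each term its own envelope factor.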


18.2.b. LINEAR ARRAYS 


A linear array of $N$ equally spaced telescopes, where $N$ is even, is shown in 
Fig. 18.4. We put the origin of the $(\xi, \eta)$ coordinate system in the center of the 
array for convenience in evaluating the sum in Eq. (18.2.1). The center-to-center 
spacing of adjacent telescopes is $\gamma D$, with $\gamma > 1$ to ensure no overlap. The 
$\xi$-coordinates at the centers of the telescopes are $\pm\gamma D/2, \pm 3\gamma D/2, \ldots, 
\pm(N - 1)\gamma D/2$. Substituting into Eq. (18.2.1) we find 


$$\sum_{j=1}^{N}\exp(-ikp\xi_j) = 2\sum_{j=1}^{N/2}\cos[(2j - 1)\gamma v_x] = \frac{\sin(N\gamma v_x)}{\sin(\gamma v_x)}, \tag{18.2.2}$$


where $v_x = \pi Dx/\lambda f$. Therefore the normalized intensity is 


$$i(P) = \frac{1}{N^2}\left[\frac{2J_1(v)}{v}\right]^2\left[\frac{\sin(N\gamma v_x)}{\sin(\gamma v_x)}\right]^2. \tag{18.2.3}$$


The smallest value of $v_x$ for which $i(P) = 0$ is given by $N\gamma v_x = \pi$, hence 
$\alpha_x = \lambda/DN\gamma$, where $\alpha_x$ is the angular resolution in the $x$ direction. 


Fig. 18.4. Linear array of telescopes, each of diameter $D$, with spacing $\gamma D$ between centers. 


Note that the distance spanned by the telescopes of the array is 
$\gamma(N - 1)D + D$, hence the angular resolution in this direction is essentially that 
of a single telescope of this diameter. The resolution in the $y$ direction is the same 
as that of a single telescope of diameter $D$, as is evident by setting $v_x = 0$ in Eq. 
(18.2.3). 
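
The closed form of the interference factor and the position of its first zero are easy to confirm numerically. The following Python sketch compares the direct cosine sum for telescope centers at $\pm\gamma D/2, \ldots, \pm(N-1)\gamma D/2$ with $\sin(N\gamma v_x)/\sin(\gamma v_x)$. The function names are illustrative, and the wavelength and aperture in the final lines are made-up values chosen only to show the scale of the result.

```python
import math


def interference_sum(vx, N, gamma):
    """Direct sum 2 * sum_{j=1}^{N/2} cos[(2j - 1) * gamma * vx] for even N."""
    return 2.0 * sum(math.cos((2 * j - 1) * gamma * vx) for j in range(1, N // 2 + 1))


def interference_closed(vx, N, gamma):
    """Closed form sin(N*gamma*vx) / sin(gamma*vx), with the limit taken at its poles."""
    theta = gamma * vx
    s = math.sin(theta)
    if abs(s) < 1e-12:
        # l'Hopital limit where the denominator vanishes.
        return N * math.cos(N * theta) / math.cos(theta)
    return math.sin(N * theta) / s


def first_zero_angle(wavelength, D, N, gamma):
    """Angle in radians from the peak to the first minimum, lambda / (N * gamma * D)."""
    return wavelength / (N * gamma * D)


N, gamma = 4, 2.0
for vx in (0.1, 0.37, 1.0):
    print(f"vx = {vx}: sum {interference_sum(vx, N, gamma):+.6f}, "
          f"closed {interference_closed(vx, N, gamma):+.6f}")

# Hypothetical example: four 8-m telescopes observed at 2.2 micrometres.
alpha = first_zero_angle(2.2e-6, 8.0, N, gamma)
print(f"first minimum at {math.degrees(alpha) * 3600e3:.2f} milliarcsec")
```

The direct sum vanishes at $v_x = \pi/N\gamma$, in agreement with the condition $N\gamma v_x = \pi$ stated above.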

Results obtained from Eq. (18.2.3) are shown in Fig. 18.5 for $N = 2$ and two 
values of $\gamma$, in a slice across the image in the $x$ direction. Note the narrowing of 
the central peak in Fig. 18.5 as $\gamma$ increases. We also see that there are more 
subsidiary peaks under the diffraction envelope when $\gamma$ is larger, with a larger 
fraction of the total energy in these peaks. These additional peaks make it more 
difficult to achieve the resolution given by the central peak. 

In Fig. 18.6 we show a slice across the image in the $x$ direction for $N = 4$ and 
$\gamma = 2$. Note that the principal peaks under the diffraction envelope are sharper, a 
consequence of the larger $N\gamma$ product, with weak subsidiary bands between the 
principal peaks. In general, there are $N - 2$ weak bands between adjacent strong 
bands. Contour maps of the images with $N = 2$, $\gamma = 2$, and $N = 4$, $\gamma = 2$ are 
shown in Fig. 18.7. 


Fig. 18.5. Normalized PSFs for pairs of telescopes. Angular distance from peak to first 
minimum is $\lambda/2\gamma D$. 

Fig. 18.6. Normalized PSF for linear array of four telescopes. Angular distance from peak to first 
minimum is $\lambda/N\gamma D$. 


Fig. 18.7. Contour maps of PSFs from linear arrays with $\gamma = 2$: (a) $N = 2$, (b) $N = 4$. 


18.2.c. SQUARE ARRAY 


As a final example we consider the square array shown in Fig. 18.8. Applying 
Eq. (18.2.1) to a square array we find that the normalized intensity is 


$$i(P) = \frac{1}{16}\left[\frac{2J_1(v)}{v}\right]^2\left[\frac{\sin(2\gamma v_x)\,\sin(2\gamma v_y)}{\sin(\gamma v_x)\,\sin(\gamma v_y)}\right]^2, \tag{18.2.4}$$


where $v_y = \pi Dy/\lambda f$ by analogy with $v_x$. Results from Eq. (18.2.4) are shown in 
Fig. 18.9 for $\gamma = 2$. Note that the profile along the $x$ and $y$ directions is the 
same as that given in Fig. 18.5 for $\gamma = 2$. This is expected because the array 
spans the same distance, $2D$ between centers, in both cases. In general, the 
FWHM of the central peak is inversely proportional to the spacing factor $\gamma$. The 
contour map for the square array with $\gamma = 2$ shows a set of nine peaks in a 
3 x 3 array. 
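
The separable form of the square-array interference factor can be checked against a direct sum over the four apertures. The Python sketch below assumes telescope centers at $(\pm\gamma D/2, \pm\gamma D/2)$ and omits the single-aperture envelope $[2J_1(v)/v]^2$, so it shows only the interference part of Eq. (18.2.4). The function name and sample points are illustrative.

```python
import math


def square_array_factor(vx, vy, gamma):
    """Normalized interference factor for four telescopes at (+-gamma*D/2, +-gamma*D/2).

    Returns (direct, product): the squared direct sum over the four apertures
    and the equivalent separable form cos^2(gamma*vx) * cos^2(gamma*vy), which
    is what [sin(2t)/sin(t)]^2 / 4 reduces to for each axis.
    """
    # Imaginary parts cancel in pairs, so only the cosines survive.
    direct = sum(math.cos(gamma * (sx * vx + sy * vy))
                 for sx in (1, -1) for sy in (1, -1)) / 4.0
    product = math.cos(gamma * vx) * math.cos(gamma * vy)
    return direct**2, product**2


gamma = 2.0
for vx, vy in ((0.0, 0.0), (0.3, 0.0), (0.3, 0.5), (math.pi / (2 * gamma), 0.2)):
    direct, product = square_array_factor(vx, vy, gamma)
    print(f"vx = {vx:.3f}, vy = {vy:.3f}: direct {direct:.6f}, product {product:.6f}")
```

Along either axis the factor reduces to that of a two-telescope pair with the same spacing, with its first zero at $v_x = \pi/2\gamma$, consistent with the profile comparison made above.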


18.2.d. CONCLUDING REMARKS 


These examples are sufficient to illustrate the general features of diffraction 
images of telescope arrays, with Eq. (18.2.1) as the basis for treating other 
configurations. We assumed the same D for telescopes in each of our examples, 
but this is not a requirement when using Eq. (18.2.1). 

For a discussion of arrays of small telescopes utilized as stellar 
interferometers, the reader should consult the reviews by Labeyrie (1978) and by 
Shao and Colavita (1978). Examples of large telescope projects planning for an 
interferometric mode are the twin Keck 10-m telescopes, the Very Large Telescope 
array, and the Large Binocular Telescope. Although most observing will continue 
to be done with individual telescopes, it is evident that interferometers will make 
possible observations of unprecedented angular resolution. 


Fig. 18.8. Square array of telescopes of diameter $D$, spacing $\gamma D$. 

Fig. 18.9. Profiles of normalized PSFs for square array with $\gamma = 2$. 


Chapter 2 


Symbol Meaning 


d distance between surfaces 

diameter of lens or mirror 

focal length; object (image) distance for image (object) at infinity 
focal ratio 

object (image) height 

Lagrange invariant 

angle of incidence (refraction) 

transverse magnification 

angular magnification 

index of refraction in space with incident (refracted) ray 
power of surface or optical system 

radius of curvature at vertex of surface 

object (image) distance 

telescope scale 

slope angle of ray before (after) refraction 

distance from stop to surface 

coordinate direction along axis of optical system 
normalized distance from exit pupil to telescope focus 
displacement of object point by plane-parallel plate 
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Symbol 


AA 
ASA 
LSA 
j 
TSA 


a, P, h 
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aS 
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slope angle of normal to surface 
angle between chief ray projected on sky and telescope axis 
angle between chief ray from exit pupil and telescope axis 


Normalized parameters for two-mirror telescopes—see Table 2.1. 


Meaning 


projected width of prism face 

eccentricity of conic section 

shorthand representation of optical path length 
conic constant 

length of oblique ray before (after) refraction 
optical length 

optical path difference 

optical path length 

local radius of curvature on optical surface 
prism base length 

coordinate on optical surface 

slope angle of ray 

zenith angle of ray 

sag of mirror surface 

prism angle 

wavelength 

curvature of path of light ray in space 


Meaning 


angular aberration 

angular spherical aberration 

longitudinal spherical aberration 

transverse distance from axis of optical system 
transverse spherical aberration 

angles of rays 

angular deviation of ray by Schmidt corrector plate 
wedge angle of corrector plate 

normalized distance from axis of Schmidt camera 
thickness of Schmidt corrector plate 
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LCA 
Q 
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Symbol 


Qs 


SA 
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Meaning 


aberration coefficient 

angular astigmatism 

angular distortion 

angular sagittal coma 

angular tangential coma 

aspheric coefficient 

aberration coefficient 

transverse offset of chief ray intersection at surface 
transverse offset of pupil or stop 
location of Petzval surface 

location of sagittal astigmatic image 
location of tangential astigmatic image 
transverse aberration 

transverse astigmatism 

transverse sagittal coma 

transverse tangential coma 
transverse distortion 

sag of object or image surface 
curvature of object or image surface 
optical path difference (OPD) 
astigmatism factor 


Meaning 


longitudinal chromatic aberration 
wavefront error 


Meaning 


vertex curvature of optical surface 

chromatic spherical aberration 

fourth-order coefficient for aspheric surface 
sixth-order coefficient for aspheric surface 

distance from stop to corrector plate 

Abbé number for glass 

expansion factor for Schmidt plate with displaced stop 
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Chapter 8 

Symbol Meaning 

o normalized distance of corrector plate in Schmidt-Cassegrain 
Chapter 10 

Symbol Meaning 

amn aberration coefficients in fractions of waves 

a,b sides of rectangle 

A area 

$A(\xi, \eta)$ aperture function 

$dS$ area element at pupil 

EE enclosed energy fraction 

FWHM full-width at half-maximum of peak intensity 

F energy or light flux 

i normalized intensity 

I intensity or irradiance 

$k$ $2\pi/\lambda$ 

OE 1-EE 

$p, q$ direction cosines 

PSF point spread function 

R radius of spherical wavefront 

S Strehl ratio 

u dimensionless diffraction variable in axial direction 
U wave amplitude 

v dimensionless diffraction variable in radial direction 
$w$ $v/\pi$ 

$\alpha$ field angle projected on the sky 

$\varepsilon$ linear obscuration ratio 

$\xi, \eta$ coordinates for wavefront at pupil 

$\rho$ fractional radius of point within a circular aperture 
$\omega$ root-mean-square (rms) wavefront error 

Chapter 11 

Symbol Meaning 

c(v,,) normalized autocorrelation function 

Ca contrast 

l normalized autocorrelation length 
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Pa spatial period 

T modulation transfer function (MTF) 
T; MTF degradation factor 

V, Va spatial frequency, normalized spatial frequency 
Vo cutoff spatial frequency 

d',o standard deviation, normalized standard deviation 
Y optical transfer function (OTF) 
Chapter 12 

Symbol Meaning 

A angular dispersion 

B photometric brightness 

d spectrometer beam width 

E irradiance 

FRD focal ratio degradation 

h, W slit height, actual and projected 

L luminosity 

P plate factor 

r anamorphic magnification 

R spectral resolving power 

S area 

U etendue 

w, w’ slit width, actual and projected 

«x, B angles before (after) dispersing element 
A pixel size 

oA limit of resolution 

op subtended angles on the sky 

T transmittance 

Q solid angle 

Chapter 13 

Symbol Meaning 

b, b' actual (effective) groove width 

BF blaze function 


m 
N 
Nr 
R 


diffraction grating order number 
total number of grooves on grating 
reflective finesse 

reflectance 
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T transmittance 

Ww grating width 

a, B grating angle of incidence (diffraction) 
ô grating blaze angle 

AL free spectral range 

Ap blaze wavelength 

0 in-plane grating angle 

o diffraction grating groove spacing 
Chapter 14 

Symbol Meaning 

Yy off-plane grating angle 

p radius of curvature of spectral line image 
Chapter 15 

Symbol Meaning 

N prism index of refraction 

6A, spectral coma 

E prism orientation parameter 

y prism angle 

Chapter 16 

Symbol Meaning 

c(v,,) normalized autocorrelation function 

H height of turbulent layer in atmosphere 
ro seeing parameter, Fried parameter 

T modulation transfer function (MTF) 

T; MTF degradation factor 

Xo angular limit of resolution 

y zenith angle 

V, Va spatial frequency, normalized spatial frequency 
y cutoff spatial frequency 

Oo isoplanatic angle 

On isoplanatic angle for image motion 
d',o rms deviation from image peak, normalized deviation 
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Chapter 17 

Symbol Meaning 

B incident background flux 

C dark count rate 

m apparent stellar magnitude 

m sky brightness in magnitudes per arc-second squared 
in) mean signal 

N number of pixels per length 1F 
Q quantum efficiency 

R rms read noise 

S incident signal flux 

SNR signal-to-noise ratio 


aga S 
S 


exposure time 

transmittance modifier 
subtended angles on the sky 
transmittance 
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A 


Abbe number, 177 
Aberration 


astigmatism, 74-77, 86, 87, 92 
character of, 86 
chromatic, 65—68, 151, 167-170 
classical, 261-263 
coma, 82, 83, 86 
distortion, 78, 86 
field curvature, 97-103 
introduction to, 50-61 
multi-surface systems, 93-95 
orthogonal, 264-266 
ray and wavefront, 78-84 
spherical 
fifth-order, 59, 174 
third-order, 51-57, 86 


Aberration, angular, see Angular aberration 
Aberration coefficients 


decentered pupil, 105 
displaced stop or pupil, 91 
grating, 358-360 
plane-parallel plate, 172 
prism, 398 
stop at surface 
general surface, 84 
mirror, 85 


Aberration coefficients, element 
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aspheric Schmidt plate, 166, 172, 173 
concentric meniscus lens, 199 
diffraction grating, 360, 399 
grism, 403 
prism, 398 
Schmidt-Cassegrain 
primary, 187 
secondary, 188 
Aberration coefficients, grating mounting 
Czerny-Turner 
monochromator, 370 
spectrograph, 372 
nonobjective 
grating, 399 
grism, 403 
prism-grating, 406 
Aberration coefficients, system 
atmospheric dispersion corrector, 233 
Bouwers camera, 199 
Cassegrain focus corrector, 216 
grism, 403 
prime focus corrector, 211, 214 
prism-grating, 406 
Schmidt 
focal reducer, 223 
general, 165 
Schmidt-Cassegrain 
all-spherical, 194 
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Aberration coefficients, system (cont.) 
flat field anastigmat, 190 
general, 189 

three-mirror telescope, 145 
Paul-Baker, 147 

two-mirror telescope 
afocal, 132 
corrected Ritchey-Chretien, 216 
general, 117 
hybrid, 130 
with misaligned secondary, 134 

Aberration, longitudinal, see Longitudinal aberration 
Aberration, transverse, see Transverse aberration 
Achromatic corrector 
Maksutov, 202 
Schmidt, 177-181 

Active optics, 258 

Adaptive optics, 258, 409-424 
deformable mirrors, 422 
tip-tilt correction, 417 
wavefront sensor, 422 

Afocal telescope, see Telescope type 

Airy disk, 248, 250, 256 

Airy pattern 
asymptotic form, 251 
encircled energy, 252-255 
general, 248 
peak intensity, 252, 256 
point spread function, 246-250 
radii of dark rings, 250 
Strehl ratio, 260 

Alignment errors 

aberration coefficients, 134, 138 

decenter, 104, 133 

despace, 143 

image shift, 134 

Schmidt camera, 105-107 

tilt, 104, 133 

two-mirror telescope, 132-144 
Anamorphic magnification 

definition, 308 

diffraction grating, 325 

prism, 321 
Anastigmatic telescope 

Schmidt-Cassegrain, 190-193 

two-mirror, 124-126, 131 
Angular aberration 

corrected Ritchey-Chretien, 212 
definition, 52, 60 
fifth-order spherical 
aspheric plate, 174 
conic mirror, 59 
spherical mirror, 174 
paraboloid, 113 
prime focus plus aspheric plate, 217 
relation to transverse, 80 
Schmidt with misaligned plate, 105-107 
Schmidt-Cassegrain aplanat, 194 
two-mirror telescope 
aplanatic, 122 
classical, 119 
despaced secondary, 143, 144 
general, 118 
hybrid, 131 
misaligned secondary, 135, 139 
Angular dispersion, see Dispersion 
Angular magnification, see Magnification 
Angular resolution, 46, 257 
limit due to atmosphere, 411-415 
limit due to diffraction, 257, 410 
Aperture function 
definition, 272 
example, spider, 272 
Aperture stop, 22 
see also Stop; Pupil 
Aplanatic condition, 87 
Aplanatic telescope, 121-123, 126-129 
Array, 451-457 
general, 452 
linear, 453 
square, 456 
theorem, 274, 452 
Aspheric corrector 
aberration coefficients, 106, 171 
achromatic, 177-180 
aspheric coefficient, 166, 173 
chromatic aberration, 65-68, 167-170 
fifth-order spherical, 174 
prime focus, 211-214 
radius of curvature, 67, 169 
Ritchey-Chretien focus, 216-219 
size, 170 
surface figure, 67, 167 
surface parameters, 167 
zonal deviations, 168 
Astigmatism, see also Aberration; Aberration 
coefficients 
angular, 60, 86 


misalignment, 138-142 
sagittal, 74, 86, 92 
tangential, 74, 86, 92 
transverse, 77, 86 
Atmosphere 
dispersion corrector, 225-237 
degradation function, 412 
differential refraction, 43, 226 
Fried parameter, 413, 416 
isoplanatic angle, 416 
long-exposure modulation transfer function 
(MTF), 412 
refraction, 43 
turbulence, 44 
Kolmogorov, 412-415 
scintillation, 410 
seeing, 44, 410 
speckle, 411 
Atmospheric dispersion corrector, 225-237 
chromatic focal error, 229, 231 
in collimated light, 227 
in convergent light, 228 
in Ritchey-Chretien telescope, 235 
Autocorrelation, 281, 283, 286 


B 
Back focal distance, 18, 115 
Baker-Schmidt telescope 
aberration coefficients, 190 
chromatic aberration, 192 
conic constants, 191 
Blaze, grating 
angle, 328, 334 
function, 332-338 
wavelength, 334 
Bouwers camera, 198-202 
aberration coefficient, 199 
chromatic aberration, 201 
concentric meniscus corrector, 198 
parameters, 201 
Brightness, see also Intensity; Irradiance, 312 
extended source, 314 
stellar source, 314 


C 
Camera 
Bouwers, 198-202 
Maksutov, 202 
Schmidt, 64-68, 164-184 
Schmidt-Cassegrain, 185-197 
Cassegrain telescope, see also Telescope type 
aberration coefficients, 117 
angular aberrations, 118, 119, 122 
diameter of secondary, 21 
exit pupil, 23, 24 
power, 19 
type 
classical, 61-64, 119-121 
Couder, 125 
Dall-Kirkham, 63, 123 
hybrid, 129-131 
Ritchey-Chretien, 121-123 
Schwarzschild, 125 
Catadioptric telescope, see also Schmidt- 
Cassegrain camera 
Bouwers, 198-202 
Maksutov, 202 
Schmidt 
solid, semi-solid, 181-184 
Schmidt-Cassegrain, 185-187 
Chief ray, 24, 72 
Chromatic aberration 
meniscus corrector, 201 
Schmidt camera, 66-68, 167-170 
Schmidt-Cassegrain, 192 
solid Schmidt, 182 
Classical Cassegrain, see Cassegrain telescope 
Collimator 
paraboloid 
off-axis, 379 
on-axis, 380 
spherical mirror, 379 
Coma, see also Aberration; Aberration 
coefficients 
angular, 60, 86 
misalignment, 106, 135-138 
sagittal, 82, 87, 92 
spectral, 376, 400 
tangential, 82, 87, 92 
transverse, 82, 86 
Conic constant 
definition, 41 
relation to eccentricity, 41 
Conic section, 41 
Conjugate points, 10 
Contrast, 278 
Coude focus, see Focus 
Couder telescope, see Cassegrain telescope 
Cross dispersion, 385-392 
grating, 387 
modes, 388-391 
prism, 387 
Curvature 
in inhomogeneous medium, 30 
of field, see Field curvature 
local, on surface, 42, 445 
spectrum line, 356 
Cutoff frequency, 278, 280 
see also Modulation transfer function 
Czerny-Turner mounting, 369-374 
monochromator, 371 
spectrograph, 373 


D 
Dall-Kirkham telescope, see Cassegrain 
telescope 
Decenter, see Alignment errors 
Defocus, 261, 263, 266 
Degradation function, see also Modulation 
transfer function, 285-288 
atmosphere, 412 
correlation length, 286 
microripple, 286 
mid-frequency ripple, 286 
root-mean-square pointing error, 288 
Despace, see Alignment errors 
Detection limits, 435-443 
stellar photometry, 438 
stellar spectroscopy, 440-443 
slit-limited, 440 
slitless, 442 
Detector 
charge coupled device (CCD), 426 
dark count, 427 
read noise, 427 
infrared, 426 
modulation transfer function, 427-431 
quantum efficiency, 427 
signal-to-noise ratio, 433-435 
Differential refraction, see Atmosphere 
Diffraction 
focus, 262, 268 
shift from paraxial focus, 263 
Fraunhofer, see Fraunhofer diffraction 
image 
annular aperture, 248-257 
array, 452-457 
circular aperture, 248-257 
rectangular aperture, 243-246 
slit, 246 
integral 
array, 452 
in presence of aberrations, 259, 272 
perfect, 243, 248 
limit, 46, 250, 257, 270 
variables, 243, 247, 250 
Diffraction grating 
anamorphic magnification, 325 
angles, 323 
angular dispersion, 324 
blaze 
angle, 328, 334 
function, 332-338 
wavelength, 334 
constant, 323 
efficiency, 331-339 
equation, 323, 355 
free spectral range, 327 
grism, 356, 402-405 
holographic 
surface, 341 
volume-phase, 342 
interference function, 333 
Littrow mode, 328 
luminosity-resolution product, 339 
polarization, 339 
reflection, 323, 355 
resolving power, 326 
resolving power-slit width product, 339 
sign convention, 323, 324, 355 
transmission, 324, 355 
Diffraction grating mounting, see also 
Aberration coefficients, element; 
Aberration coefficients, grating mounting; 
Spectrometer type 
concave 
inverse Wadsworth, 365 
Rowland, 361-363 
Wadsworth, 361, 364 
plane 
Czerny-Turner, 369-374 
Ebert-Fastie, 374 
Monk-Gillieson, 375-377 
transmission 
nonobjective, 399, 405 
Dispersion 
angular, 35, 305 


prism, 321 
grating, 324 
curves, glass, 322 
linear, 306 
Distortion, see Aberration; Aberration 
coefficients 


E 
Ebert-Fastie 
grating mounting, 374 
as 1:1 reimager, 107-109 
Eccentricity, 41 
Echelle, 325, 327-331, 384-396 
blaze 
angle, 334, 337 
function, 337 
wavelength, 334 
blaze peak 
efficiency, 337-339 
equations, 386 
cross-dispersion modes, 388-391 
design example, 394 
effective groove width, 337 
free spectral range, 385 
immersed, 330, 395 
Littrow mode, 328 
order 
length, 385 
separation, 387 
tilt, 386 
overfilling, 341 
polarization, 339 
R-value, 328 
spectrometer configuration 
in-plane, 392 
off-plane, 392 
white pupil, 393 
spectrum format, 385-388 
Echellette, see also Diffraction grating, 325 
crossed, 396 
Ellipsoid, 39, 41 
Encircled energy (EE), 252-255 
annular aperture, 253 
asymptotic approximation, 255 
circular aperture, 253 
definition, 252 
in presence of 
figure error, 269 
random wavefront error, 289 
rms pointing error, 291 
small errors, 449 
relation to modulation transfer function 
(MTF), 282 
Enclosed energy, see also Encircled energy, 245 
Entrance pupil, 22 
Etendue 
definition, 312, 313 
diffraction limit, 319 
Fabry-Perot, 345 
Fourier transform spectrometer, 349 
spectrometer, 313 
Exit pupil 
definition, 23 
two-mirror telescope, 23, 24 


F 
Fabry lens, 210 
Fabry-Perot interferometer, 342-347 
Airy function, 343 
comparison with echelle, 346 
etendue, 345 
free spectral range, 344 
luminosity-resolution product, 345 
reflective finesse, 344 
spectral purity, 344 
spectral resolution, 344 
type 
imaging, 345 
scanning, 344 
Fermat’s Principle 
aberration compensation example 
Cassegrain telescope, 61-64 
Schmidt camera, 64-68 
application to, 
atmosphere, 42-45 
conic mirrors, 37-41 
diffraction grating, 353-356 
general surface, 70-74 
prism, 34 
spherical surface, 32 
thin lens, 33 
general statement, 28 
laws of refraction & reflection, 31, 32 
physical interpretation, 36 
Fiber optics, 237-239 
focal ratio degradation (FRD), 238, 317 
as input 
integral field spectrometer (IFS), 305 
multiple object spectrometer (MOS), 305, 
383 
Field curvature, 97-103 
introduction to, 60 
median, 101 
Petzval, 98 
sagittal, 100 
tangential, 100 
Field curvature, system 
grating mountings 
Czerny-Turner, 374 
Monk-Gillieson, 362 
nonobjective, 399 
Rowland, 362 
Wadsworth, 362 
prime focus with corrector, 212 
Schmidt-Cassegrain aplanat, 194 
spherical mirror, 101 
two-mirror telescope 
aplanatic, 122 
classical, 119 
general, 118 
Field flattener lens 
aberrations, 207 
Ritchey-Chretien telescope, 207 
Schmidt camera, 102, 208-210 
Field lens, 207, 210 
Field stop, 22 
Flux, 256, 312, 313, 315, 435, 436 
Flux-resolution product, see also Luminosity- 
resolution product 
diffraction limit, 319 
general, 314 
Focal length 
mirror, 13, 38 
thick lens, 15 
thin lens, 15, 33 
two-mirror telescope, 19 
Focal ratio, 18 
Focal reducer 
general configuration, 220-222 
Schmidt camera example, 222-225 
types, 222 
Focus 
Cassegrain, 18, 207 
coude, 131, 407 
diffraction, 262 
Gregorian, 18 
Nasmyth, 131, 393, 407 


prime, 210 
Folded Schmidt camera, 382 
Fourier transform, see Fraunhofer diffraction 
Fourier transform spectrometer, 347-350 
Four-mirror telescope, 154-161 
examples, 155-158 
pupil alignment, 159-161 
Fraunhofer diffraction 
annular aperture, 246-252 
aperture function, 272 
circular aperture, 246-252 
definition, 242 
Fourier transform, 272 
rectangular aperture, 242-246 
slit, 246 
Free spectral range 
diffraction grating, 326 
Fabry-Perot, 344 
Frequency, see Spatial frequency 
Full Width Half Maximum, see Point spread 
function 


G 

Gaussian equation, see Paraxial equation 
Gaussian profile 

image motion, 288 

intensity, 411 

modulation transfer function, 412 
Glass 

Abbe number, 177 

dispersion, 322 

index of refraction, 28, 35 
Grating, see Diffraction grating 
Gregorian telescope, see also Telescope type 

aberration coefficients, 117 

angular aberrations, 118, 119, 122 

aplanatic, 121-123 

classical, 119-121 

exit pupil, 23, 24 

power, 19 
Grism, 356, 402-405 


H 
Holographic grating, see Diffraction grating 
Hubble Space Telescope 
astigmatism, 270 
COSTAR, 301 
image characteristics 
as-built, 298-302 


predicted, 291-297 
intensity, 256, 257 
optical fix, 159, 160, 301 
original instrument complement, 292 
parameters, 292 
replacement instruments 
advanced camera system (ACS), 430, 
431, 438 
near infrared camera and multiple object 
spectrometer (NICMOS), 430, 431 
Huygens-Fresnel principle, 241 
diffraction integral, 242, 259, 272 
Hybrid telescope, 129-131 
Hyperboloid, 40, 41 


I 
Image 
brightness, see Intensity; Irradiance 
diffraction, see Diffraction image 
sharpness, 432 
slicer, 316 
Index of refraction, 8, 28, 35 
Intensity, see also Irradiance, 255-257 
asymptotic average, 245, 252 
average over Airy disk, 256 
definition, 243 
multiple aperture telescope 
linear array, 454 
square array, 456 
normalized, 244, 248 
peak, 252, 256 
radial dependence, 252 
Strehl, 260 
Interferometer 
Fabry-Perot, 342-347 
Michelson, 347 
telescope array, 451-457 
Twyman-Green, 446, 447 
Inverse Wadsworth, see Diffraction grating 
mounting 
Irradiance, see also Intensity, 255-257, 315 
definition, 256, 315 
pixel, 315 
spectral, 315 


L 
Lagrange invariant, 12, 313 
Lateral magnification, see Magnification 
Limit of resolution, see Resolution 
Limiting magnitude 
general form, 436, 440 
stellar photometry 
background-limited, 437 
signal-limited, 437 
slit-limited spectroscopy 
extended source, 441 
stellar source, 440 
slitless spectroscopy, 442 
Longitudinal aberration 
chromatic, 151, 201, 229 
spherical, 51 
Luminosity, see also Etendue, 312, 313 
Luminosity-resolution product, see also Flux- 
resolution product 
diffraction limit, 319 
Fabry-Perot, 345 
Fourier transform spectrometer, 349 
general, 314 
grating, 339 


M 
Magnification 
anamorphic, see Anamorphic magnification 
angular, 11 
lateral or transverse 
mirror, 13 
refracting surface, 11 
sign, 11, 19 
thin lens, 15 
pupil, 160 
Maksutov camera, 202-204 
Meniscus corrector 
achromatic, 202 
concentric, 198-202 
Michelson interferometer, 347 
Mirror 
print-through, 449 
residual error, 448 
shaping, 445 
testing, 446 
Modulation transfer function (MTF), 277-291, 
411-415, 427-431 
annular aperture, 282-284 
atmosphere, 411-415 
cutoff frequency, 280 
definition, 278 
and degradation functions, 285-290 
in presence of aberrations, 283-285 
pixel, 427-431 
relation to 
encircled energy, 282 
point spread function, 282 
Monk-Gillieson mounting, 375-377 
Monochromator, see also Spectrometer type 
Czerny-Turner, 369-371 
Ebert-Fastie, 374 
Monk-Gillieson, 375-377 
Multiple aperture telescope, see Array 


N 
Nasmyth focus, see Focus 
Neutral point, 136 
Noise, see also Signal-to-noise ratio 
dark count, 427, 435 
photon, 433 
read, 427, 435 
Nonobjective 
grating, 399-402 
grism, 402-405 
prism, 397-399 
prism-grating, 405, 406 
Normalized intensity, see Strehl ratio 
Normalized parameters, two-mirror telescopes 
definitions, 18, 115 
table, 18, 115 
Nyquist criterion 
image sharpness, 432 
pixel matching, 377 
sampling, 310, 429-431 


O 
Objective prism, 322 
Objective mode, 318, 322 
Obscuration, 21, 128 
Off-axis paraboloid, see Paraboloid 
Optical path difference (OPD) 
diffraction grating, 358 
displaced stop, 89 
multi-surface system, 93 
relation to transverse aberration, 79, 94 
single surface, 73, 79 
Optical path length (OPL) 
general, 28 
grating, 353-355 
refracting surface, 31, 32, 71-73 
Optical transfer function (OTF), see Transfer 
function 


P 
Paraboloid, 38, 113-115 
angular aberrations, 113, 114 
image surface curvature, 113 
off-axis, 379 
reimager at 1:1, 107-109 
Paraxial approximation, 9 
Paraxial equation 
mirror, 12 
refracting surface, 10, 33 
thin lens, 15 
Paul-Baker telescope, 145-152 
Petzval 
curvature, 98, 99 
spherical mirror, 101 
plus thin lens, 102 
surface, 98 
two-mirror telescope, 103 
Phase transfer function, see Transfer function 
Photometry, 438, 439 
Pixel, 315, 317, 426 
irradiance, 315 
matching, 317 
Pixel modulation transfer function 
approximation, 431 
ideal square well, 427-431 
Plane-parallel plate 
aberration coefficients, 172 
image displacement, 16 
Plate factor, 307 
Point spread function (PSF), 243-252, 282-290 
annular aperture, 246 
array, 454, 456 
asymptotic approximation, 245, 252 
encircled energy, see Encircled energy 
enclosed energy, 245 
full width half maximum (FWHM), 244, 251 
Gaussian profile, 411 
in presence of aberrations, 266-269 
relation to modulation transfer function, 282 
with random 
rms pointing error, 290 
wavefront error, 288 
rectangular aperture, 243-246 
Power 
mirror, 13 


refracting surface, 10 
separated thin lens doublet, 15 
thick lens, 15 
thin lens, 15 
two-mirror telescope, 19 
Prime focus telescope, 113-115 
Prime focus corrector, 210-216 
aspheric plate, 211-213 
multiple aspheric plates, 214 
Wynne triplet, 215 
Print-through, 449-451 
Prism 
aberration coefficients, 397 
angular dispersion, 35, 321 
deviation, 66 
transverse aberrations, 233, 234 
zero-deviation, 225, 234 
Pupil 
alignment, 159-161 
entrance, 22 
exit, 23 
decentered, 103-105 
displaced, 88-90 
magnification, 160 
shear, 160 
two-mirror telescope, 23, 24 


Q 
Quantum efficiency, 427 


R 
Radius of curvature 
sign convention, 8 
image surfaces, 97-103 
spectral image, 357 
Ray coordinate system, 8, 70-72, 353-355 
Rayleigh criterion, 257, 271, 280, 318 
Reflection, law of, 12, 31 
Refraction 
atmospheric, 43 
constant of, 43 
differential atmospheric, 43, 225, 226 
law of, 9, 31, 73 
Resolution, see also Spectral resolving power 
angular, 46, 257, 280, 455 
atmospheric limit, typical, 114, 410-415 
diffraction limit, 257, 280, 318-320 
limit of 
slit spectroscopy, 310 
slitless spectroscopy, 318 

Resolving power, see Spectral resolving power 
Ritchey-Chretien telescope 

angular aberrations, 122 

conic constants, 121 

corrected at Cassegrain focus, 216-219 

field curvature, 122 

field-flattened, 207 

misaligned, 135-142 

modified flat-field, 219 
Root-mean-square wavefront error, see Wavefront 

Rowland mounting, 360-363 


S 
Sagittal 
astigmatic image location, 74, 92 
astigmatism, 75, 77 
coma, 82, 87, 92 
image surface curvature, 100, 102 
Scale, telescope, 20 
Schmidt camera, 64-68, 102, 164-184 
aberration coefficients, 165 
with misaligned corrector, 102 
achromatic, 177-181 
all-reflecting, 204 
aspheric coefficient, 67, 166, 173 
chromatic aberration, 66-68, 167-170 
field-flattened, 102, 208-210 
fifth-order spherical aberration, 174 
focal reducer, 222-225 
folded, 382 
solid, semi-solid, 181-184 
Schmidt telescope, see Schmidt camera 
Schmidt-Cassegrain camera, 185-197 
aberration coefficients 
flat-field, 190 
general, 189 
all-spherical mirrors, 193-195 
anastigmatic flat-field, 190-193 
aplanat, 195 
chromatic aberration, 192 
compact, 195-197 
Schwarzschild telescope, see Telescope type 
Secondary mirror 
alignment errors and aberrations, 132-144 
diameter 
Schmidt-Cassegrain, 190 
two-mirror, 21 
distance from focus, 21 
neutral point, 136 
Seeing, atmospheric, see Atmosphere 
Segmented mirror telescope, 162, 445 
Seidel coefficients, 111 
Semi-solid Schmidt camera, see Schmidt 
camera 
Shift, focal surface, 20, 21 
Sign convention, 9, 355 
angles, 9 
distances, 8 
indices of refraction, 13 
surface radii, 8, 97 
Signal-to-noise ratio (SNR), 433-438 
background-limited case, 434, 437 
definition, 433 
detector-limited case, 435 
fractional accuracy, 433 
ideal case, 433 
signal-limited case, 434, 437 
Slit spectroscopy, limiting magnitude 
extended source, 441 
stellar source, 440 
Slitless spectroscopy 
limiting magnitude, 443 
spectral purity, 318 
Snell’s law 
reflection, 12, 31 
refraction, 9, 31, 73 
Solid Schmidt camera, see Schmidt camera 
Space Telescope, see Hubble Space Telescope 
Spatial frequency, 278-280 
cutoff, 278, 280 
normalized, 280 
Speckle, 411 
Spectral purity 
definition, 310 
diffraction limit, 319 
slitless mode, 318 
Spectral resolving power 
definition, 311 
diffraction limit, 319 
Fabry-Perot, 344 
grating, 326 
prism, 322 
Spectrograph, see Spectrometer parameters; 
Spectrometer type 
Spectrometer parameters 
anamorphic magnification, 308, 321, 325 
dispersion 
angular, 305, 321, 323 
linear, 306 
etendue, 312 
flux, 312 
flux-resolution product, 314 
free spectral range, 327, 344 
irradiance, 315 
limit of resolution, 310, 318 
luminosity, 312 
luminosity-resolution product, 314, 319 
plate factor, 307 
projected slit, 308, 318 
resolving power-slit width product, 339 
spectral purity, 310, 319 
spectral resolving power, 311, 317, 319, 322, 
326, 344 
spectrum line 
curvature, 356 
tilt, 356 
speed, 315 
Spectrometer type 
Czerny-Turner, 369-374 
Ebert-Fastie, 374 
echelle, see Echelle 
Fabry-Perot, 342-347 
fiber-fed, 317 
design example, 383 
Fourier transform, 347-350 
inverse Wadsworth, 365 
Monk-Gillieson, 375-377 
Rowland, 360-364 
slitless, nonobjective 
grating, 399 
grism, 402 
prism, 397 
prism-grating, 405 
Wadsworth, 360-362, 364 
Spectrum line 
curvature, 356 
tilt, 356 
Speed, see Spectrometer parameters 
Spherical aberration, see also Aberration; 
Aberration coefficients 
angular, 52 
circle of least confusion, 56 
definition, 50 
fifth-order conic, 59 
fifth-order, in collimated light 
aspheric plate, 174 


spherical mirror, 174 
longitudinal, 51 
third-order, 51, 59 
transverse, 50, 86 
Stop, see also Pupil 
aperture 
definition, 22 
displaced from surface, 88-90 
field, 22 
Stop-shift statements, 89, 90 
Strehl ratio, 260, 262, 263, 293, 296, 418 
Super-Schmidt camera, see Telescope type 
Surface curvature, image, 97-103 
median, 101 
Petzval, 98 
sagittal, 100 
tangential, 100 
Surface equation 
aspheric, 167 
conic mirror, 41, 42 
general, 50, 71 


T 
Tangential 
astigmatic image location 
general, 74, 92 
grating, 359 
astigmatism, 75, 77, 360 
coma, 82, 87, 92 
image surface curvature 
general, 100, 102 
grating, 362 
Telescope type 
afocal, 131 
all-reflecting, 204 
aplanatic 
flat-field, 126 
two-mirror, 121-123, 125 
anastigmatic 
Couder, 125 
Schwarzschild, 125 
array, see Array 
Baker-Schmidt, see Schmidt-Cassegrain 
camera 
Bouwers meniscus, 198-202 
Cassegrain, 17, 61-64, 119-121, 126-129 
classical, 61-64, 119-121 
Couder, 125 
Dall-Kirkham, 63, 123 
four-mirror, 154-161 
Gregorian, 17, 126-129 
hybrid, 129-131 
inverse Cassegrain, 126 
Maksutov, 202 
paraboloid, 113-115 
Paul-Baker, 145-152 
prime focus corrected, 210-216 
Ritchey-Chretien, 121-123 
corrected, 216-219 
field-flattened, 207 
Schmidt, see Schmidt camera 
Schmidt-Cassegrain, see Schmidt-Cassegrain 
camera 
Schwarzschild, 125 
three-mirror, 144-153 
Thick lens, 14 
Thick plate, 16 
aberration coefficients, 172 
image displacement, 16 
Thin lens 
focal length, 15, 33, 34 
paraxial equation, 15 


power, 15 
Three-mirror telescope, 144-153 
Korsch 
flat-field, 152 
two-axis, 153 
Paul-Baker, 145-152 
Robb, 153 
Transfer function, see also Modulation transfer 
function 
definition 


modulation, 278 
optical, 280 
phase, 280 
relation to 
encircled energy, 282 
point spread function, 282 
Transverse aberration 
chromatic 
Schmidt camera, 66-68, 167-170 
Schmidt-Cassegrain camera, 192 
definitions, 50, 77, 79, 86 
diffraction grating, 359-361 
grating mounting 
Czerny-Turner 
monochromator, 371 
spectrograph, 373 
inverse Wadsworth, 366 
Monk-Gillieson, 360, 376 
Rowland, 360 
Wadsworth, 360 
in limit of small object distance, 207 
multi-surface system, 93-95 
nonobjective mode 
grating, 400 
grism, 403 
prism, 233, 398 
plane-parallel plate, 232 
prism, doublet, 234 
relation to angular, 80 
single surface, stop at surface, 85 
Transverse magnification, see Magnification 
Turbulence, see Atmosphere 
Two-mirror telescope, see also Telescope type 
aberration coefficients, 117 
alignment errors, 132-144 
angular aberrations, 118, 119, 122 
comparison between types, 126-129 
conic constants, 119, 121 
image surface curvatures, 118 
neutral point, 136 
normalized parameters, 18, 115 
parameter combinations, 116 
Petzval curvature, 103, 118 
power, 19 
Twyman-Green interferometer, 446, 447 


V 
Vignetting, 128, 149, 340, 383 


W 
Wadsworth mounting, 360-362, 364 
Wavefront 
aberration, 78-84, 258 
definition, 45 
distortion 
Fried parameter, 413, 416 
isoplanatic angle, 416 
partial correction, 416-421 
root-mean-square error, 260-269 
Wynne triplet, 215, 216 


Z 
ZEMAX, 6 
Zernike polynomials, 264-269 


